(54) Title: EXPRESSION OF FRUCTOSE 1,6 BISPHOSPHATE ALDOLASE IN TRANSGENIC PLANTS



(57) Abstract 



Fructose-1,6-bisphosphate aldolase (FDA) is an enzyme that reversibly catalyzes the conversion of triosephosphate into
fructose-1,6-bisphosphate. In the leaf, this enzyme is located in the chloroplast (starch synthesis) and the cytosol (sucrose biosynthesis).
Transgenic plants were generated that express the E. coli fda gene in the chloroplast to improve plant yield by increasing leaf starch
biosynthetic ability in particular and sucrose production in general. Leaves from plants expressing the fda transgene showed a significantly
higher starch accumulation, as compared to control plants expressing the null vector, particularly early in the photoperiod, but had lower leaf
sucrose. Transgenic plants also had a significantly higher root mass. Furthermore, transgenic potatoes expressing fda exhibited improved
uniformity of solids.
EXPRESSION OF FRUCTOSE 1,6 BISPHOSPHATE
ALDOLASE IN TRANSGENIC PLANTS

This invention relates to the expression of fructose 1,6 bisphosphate aldolase
(FDA) in transgenic plants to increase or improve plant growth and development, yield,
vigor, stress tolerance, carbon allocation and storage into various storage pools, and
distribution of starch. Transgenic plants expressing FDA have increased carbon
assimilation, export and storage in plant source and sink organs, which results in growth,
yield and quality improvements in crop plants.

Recent advances in genetic engineering have provided the prerequisite tools to
transform plants to contain alien (often referred to as "heterologous") or improved
endogenous genes. These genes can lead either to an improvement of an already existing
pathway in plant tissues or to an introduction of a novel pathway to modify product levels,
increase metabolic efficiency, and/or save on energy cost to the cell. It is presently
possible to produce plants with unique physiological and biochemical traits and
characteristics of high agronomic and crop processing importance. Traits that play an
essential role in plant growth and development, crop yield potential and stability, and crop
quality and composition include enhanced carbon assimilation, efficient carbon storage,
and increased carbon export and partitioning.

Atmospheric carbon fixation (photosynthesis) by plants represents the major source
of energy to support processes in all living organisms. The primary sites of photosynthetic
activity, generally referred to as "source organs", are mature leaves and, to a lesser extent,
green stems. The major carbon products of source leaves are starch, which represents the
transitory storage form of carbohydrate in the chloroplast, and sucrose, which represents
the predominant form of carbon transport in higher plants. Other plant parts named "sink
organs" (e.g., roots, fruit, flowers, seeds, tubers, and bulbs) are generally not autotrophic
and depend on import of sucrose or other major translocatable carbohydrates for their
growth and development. The storage sinks deposit the imported metabolites as sucrose
and other oligosaccharides, starch and other polysaccharides, proteins, and triglycerides.

In leaves, the primary products of the Calvin Cycle (the biochemical pathway
leading to carbon assimilation) are glyceraldehyde 3-phosphate (G3P) and
dihydroxyacetone phosphate (DHAP), also known as triose phosphates (triose-P). The
condensation of G3P and DHAP into fructose 1,6 bisphosphate (FBP) is catalyzed
reversibly by the enzyme fructose 1,6 bisphosphate aldolase (FDA), and various isozymes
are known. The acidic isoenzyme appears to be chloroplastic and comprises about 85% of
the total leaf aldolase activity. The basic isoenzyme is cytosolic. Both isoenzymes appear
to be encoded by the nuclear genome, by different genes (Lebherz et al.,
1984).
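
The reversible condensation described above can be written compactly as follows. This is a notational sketch only: the abbreviations are those defined in the text, and the double arrow indicates reversibility without implying anything about the equilibrium position.

```latex
\[
\mathrm{G3P} + \mathrm{DHAP} \;\rightleftharpoons\; \mathrm{FBP}
\qquad \mbox{(catalyzed by FDA)}
\]
```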

In the leaf, the chloroplast FDA is an essential enzyme in the Calvin Cycle, where
its activity generates metabolites for starch biosynthesis. Removal of more than 40% of
the plastidic aldolase enzymatic activity by antisense technology reduced leaf starch
accumulation as well as soluble protein and chlorophyll levels, and also reduced plant
growth and root formation (Sonnewald et al., 1994). In contrast, the cytosolic FDA is part
of the sucrose biosynthetic pathway, where it catalyzes the production of FBP.
Moreover, cytosolic FDA is also a key enzyme in the glycolytic and gluconeogenic
pathways in both source and sink plant tissues.

In the potato industry, production of tubers with higher starch and uniform solids is
highly desirable and valuable. The current potato varieties that are used for french fry
production, such as Russet Burbank and Shepody, suffer from a non-uniform deposition of
solids between the tuber pith (inner core) and the cortex (outer core). French fry strips that
are taken from pith tissue are higher in water content when compared to outer cortex
french fry strips; cortex tissue typically displays a solids level of twenty-four percent
whereas pith tissue typically displays a solids level of seventeen percent. Consequently, in
the french fry production process, the pith strips need to be blanched, dried, and par-fried
for longer times to eliminate the excess water. Adequate processing of the pith fries
results in the over-cooking of fries from the high solids cortex. The blanching, drying, and
par-frying times of the french fry processor need to be adjusted accordingly to
accommodate the low solids pith strips and the high solids cortex strips. A higher solids
potato with a more uniform distribution of starch from pith to cortex would allow for a
more uniform finished fry product, with higher plant throughput and cost savings due to
reduced blanch, dry and par-fry times.
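
To make the processing penalty concrete, the solids figures quoted above (seventeen percent for pith, twenty-four percent for cortex) can be put through a simple mass balance. The sketch below is illustrative only: the function name `water_to_remove` and the assumption that dry solids are conserved while only water is driven off are ours, not taken from the specification.

```python
def water_to_remove(mass_kg: float, solids_start: float, solids_target: float) -> float:
    """Water (kg) that must be removed to raise the solids fraction of a strip.

    Assumes dry solids are conserved and only water leaves the tissue
    (an illustrative simplification, not a process model).
    """
    if not 0 < solids_start <= solids_target < 1:
        raise ValueError("fractions must satisfy 0 < start <= target < 1")
    solids_kg = mass_kg * solids_start          # dry matter stays constant
    final_mass_kg = solids_kg / solids_target   # total mass at the target fraction
    return mass_kg - final_mass_kg


PITH_SOLIDS = 0.17    # typical pith solids level quoted in the text
CORTEX_SOLIDS = 0.24  # typical cortex solids level quoted in the text

extra = water_to_remove(1.0, PITH_SOLIDS, CORTEX_SOLIDS)
print(f"{extra:.3f} kg of water per kg of pith strips")  # prints 0.292 kg ...
```

Under these assumptions, roughly 0.29 kg of additional water per kilogram of pith strips would have to be removed before pith strips match the solids level of cortex strips, which is the gap a more uniform tuber would narrow.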

Although various fructose 1,6 bisphosphate aldolases have been previously
characterized, it has been discovered that overexpression of the enzyme in a transgenic
plant provides advantageous results in the plant such as increasing the assimilation, export
and storage of carbon, increasing the production of oils and/or proteins in the plant and
improving tuber solids uniformity.

The present invention provides structural DNA constructs that encode a fructose
1,6 bisphosphate aldolase (FDA) enzyme and that are useful in increasing carbon
assimilation, export, and storage in plants.

In accomplishing the foregoing, there is provided, in accordance with one aspect of
the present invention, a method of producing genetically transformed plants that have
elevated carbon assimilation, storage, export, and improved solids uniformity comprising
the steps of:

(a) inserting into the genome of a plant a recombinant, double-stranded DNA
molecule comprising

    (i) a promoter that functions in the cells of a target plant tissue,

    (ii) a structural DNA sequence that causes the production of an RNA sequence
    that encodes a fructose 1,6 bisphosphate aldolase enzyme,

    (iii) a 3' non-translated DNA sequence that functions in plant cells to cause
    transcriptional termination and the addition of polyadenylated nucleotides
    to the 3' end of the RNA sequence;

(b) obtaining transformed plant cells; and

(c) regenerating from transformed plant cells genetically transformed plants that
have elevated FDA activity.

In another aspect of the present invention there is provided a recombinant,
double-stranded DNA molecule comprising in sequence

    (i) a promoter that functions in the cells of a target plant tissue,

    (ii) a structural DNA sequence that causes the production of an RNA sequence
    that encodes a fructose 1,6 bisphosphate aldolase enzyme,

    (iii) a 3' non-translated DNA sequence that functions in plant cells to cause
    transcriptional termination and the addition of polyadenylated nucleotides to
    the 3' end of the RNA sequence.

In a further aspect of the present invention, the structural DNA sequence that
causes the production of an RNA sequence that encodes a fructose 1,6 bisphosphate
aldolase enzyme is coupled with a chloroplast transit peptide to facilitate transport of the
enzyme to the plastid.
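
The element order recited above (promoter, structural sequence, 3' non-translated region, with an optional chloroplast transit peptide) can be modelled as a small data structure. This is a hedged sketch for illustration: the class name `ExpressionCassette`, its field names, and the placeholder labels are hypothetical, and placing the transit peptide immediately upstream of the coding sequence reflects the usual N-terminal fusion rather than anything stated in this passage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExpressionCassette:
    """Ordered elements of a recombinant DNA molecule (labels are placeholders)."""
    promoter: str
    coding_sequence: str
    three_prime_region: str
    transit_peptide: Optional[str] = None  # optional plastid-targeting element

    def elements(self) -> List[str]:
        """Return the elements in 5' to 3' order."""
        parts = [self.promoter]
        if self.transit_peptide is not None:
            # Assumed position: fused directly upstream of the coding sequence.
            parts.append(self.transit_peptide)
        parts += [self.coding_sequence, self.three_prime_region]
        return parts


cassette = ExpressionCassette(
    promoter="promoter",
    coding_sequence="fda",
    three_prime_region="3' non-translated region",
    transit_peptide="CTP",
)
print(" - ".join(cassette.elements()))
```

Omitting `transit_peptide` yields the three-element molecule of the preceding aspect; supplying it yields the plastid-targeted variant.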
In accordance with the present invention, an improved means for increasing carbon
assimilation, storage and export in the source tissues of various plants is provided. Further
means of improved carbon accumulation in sinks (such as roots, tubers, seeds, stems, and
bulbs) are provided, thus increasing the size of various sinks (larger roots, tubers, etc.) and
subsequently increasing yield and crop productivity. The increased carbon availability to
these sinks would also improve composition and use efficiency in the sink (oil, protein,
starch and/or sucrose production, and/or solids uniformity).

Various advantages may be achieved by the aims of the present invention, 
including: 

First, increasing the expression of the FDA enzyme in the chloroplast would
increase the flow of carbon through the Calvin Cycle and increase atmospheric carbon
assimilation during early photoperiod. This would result in an increase in photosynthetic
efficiency and an increase in chloroplast starch production (a leaf carbon storage form
degraded during periods when photosynthesis is low or absent). Both of these responses
would lead to an increase in sucrose production by the leaf and a net increase in carbon
export during a given photoperiod. This increase in source capacity is a desirable trait in
crop plants and would lead to increased plant growth, storage ability, yield, vigor, and
stress tolerance.

Second, increasing FDA expression in the cytosol of photosynthetic cells would
lead to an increase in sucrose production and export out of source leaves. This increase in
source capacity is a desirable trait in crop plants and would lead to increased plant growth,
storage ability, yield, vigor, and stress tolerance.

Third, expression of FDA in sink tissues can show several desirable traits, such as
increased amino acid and/or fatty acid pools via increases in carbon flux through
glycolysis (and thus pyruvate levels) in seeds or other sinks and increased starch levels as a
result of increased production of glucose 6-phosphate in seeds, roots, stems, and tubers
where starch is a major storage nonstructural carbohydrate (reverse glycolysis). This
increase in sink strength is a desirable trait in crop plants and would lead to increased plant
growth, storage ability, yield, vigor, and stress tolerance.

Fourth, the invention is particularly desirable for use in the commercial production
of foods derived from potatoes. Potatoes used for the production of french fries and other
products suffer from a non-uniform distribution of solids between the tuber pith (inner
core) and the cortex (outer core). Thus, french fry strips from the pith regions of such
tubers have a low solids content and a high water content in comparison to cortex strips
from the same tubers. Therefore, the french fry processor attempts to adjust the processing
parameters so that the final inner strips are sufficiently cooked while the outer cortex strips
are not overcooked. The results of such adjustments, however, are highly variable and
may lead to poor quality product. Transgenic potatoes expressing fda will provide to the
french fry and potato chip processor a raw product that consistently displays a higher tuber
solids uniformity with acceptable agronomic traits. In the french fry plant production
process, inner pith fry strips from higher solids uniformity tubers will require less time to
blanch, less time to dry to a specific solids content, and less time to par-fry before freezing
and shipping to retail and institutional end-users.

Therefore, with respect to potatoes, the present invention provides 1) a higher
quality, more uniform finished fry product in which french fries from all tuber regions, when
processed, are nearly the same, 2) a higher throughput in the french fry processing plant
due to lower processing times, and 3) processor cost savings due to lower energy input
required for shorter blanch, dry, and par-fry times. A raw tuber product that displays a
higher solids uniformity will also produce a potato chip that has a reduced saddle curl and
a reduced tendency for center bubble, both of which are undesirable qualities in the potato chip
industry. Reduced fat content would also result; this would contribute to improved
consumer appeal and lower oil use (and costs) for the processor. The increase in solids
uniformity will also translate to an increase in overall tuber solids. For both the french fry
and chipping industries, this overall tuber solids increase will also result in higher
throughput in the processing plant due to lower processing times, and cost savings due to lower
energy input for blanching, drying, par-frying, and finish frying.

Figure 1 shows the nucleotide sequence and deduced amino acid sequence of a fructose
1,6 bisphosphate aldolase gene from E. coli (SEQ ID NO:1).

Figure 2 shows a plasmid map for plant transformation vector pMON17524. 

Figure 3 shows a plasmid map for plant transformation vector pMON17542.

Figure 4 shows the change in diurnal fluctuations of sucrose, glucose, and starch levels in
tobacco leaves expressing the fda transgene (pMON17524) and control (pMON17227).
The light period is from 7:00 to 19:00 hours. Only fully expanded and non-senescing
leaves were sampled.

Figure 5 shows a plasmid map for plant transformation vector pMON13925.

Figure 6 shows a plasmid map for plant transformation vector pMON17590.

Figure 7 shows a plasmid map for plant transformation vector pMON13936.

Figure 8 shows a plasmid map for plant transformation vector pMON17581. 

Figure 9 shows potato tuber cross-sections of improved solids uniformity Segal Russet
Burbank lines (top row) versus unimproved nontransgenic Russet Burbank (bottom row).

This invention is directed to a method for producing plant cells and plants
demonstrating an increased or improved growth and development, yield, quality, starch
storage uniformity, vigor, and/or stress tolerance. The method utilizes a DNA sequence
encoding an fda (fructose 1,6 bisphosphate aldolase) gene integrated in the cellular
genome of a plant as the result of genetic engineering and causes expression of the FDA
enzyme in the transgenic plant so produced. Plants that overexpress the FDA enzyme
exhibit increased carbon flow through the Calvin Cycle and increased atmospheric carbon
assimilation during early photoperiod, resulting in an increase in photosynthetic efficiency
and an increase in starch production. Thus, such plants exhibit higher levels of sucrose
production by the leaf and the ability to achieve a net increase in carbon export during a
given photoperiod. This increase in source capacity leads to increased plant growth that in
turn generates greater biomass and/or increases the size of the sink, ultimately
providing greater yields of the transgenic plant. This greater biomass or increased sink
size may be evidenced in different ways or plant parts depending on the particular plant
species or growing conditions of the plant overexpressing the FDA enzyme. Thus,
increased size resulting from overexpression of FDA may be seen in the seed, fruit, stem,
leaf, tuber, bulb or other plant part depending upon the plant species and its dominant sink
during a particular growth phase and upon the environmental effects caused by certain
growing conditions, e.g., drought, temperature or other stresses. Transgenic plants
overexpressing FDA may therefore have increased carbon assimilation, export and storage
in plant source and sink organs, which results in growth, yield, and uniformity and quality
improvements.

Plants overexpressing FDA may also exhibit desirable quality traits such as
increased production of starch, oils and/or proteins depending upon the plant species
overexpressing the FDA. Thus, overexpression of FDA in a particular plant species may
affect or alter the direction of the carbon flux, thereby directing metabolite utilization and
storage either to starch production, protein production or oil production via the role of
FDA in the glycolysis and gluconeogenesis metabolic pathways.

The mechanism whereby the expression of exogenous FDA modifies carbon
relationships is believed to derive from source-sink relationships. The leaf tissue is a
sucrose source, and if more sucrose resulting from the activity of increased FDA
expression is transported to a sink, it results in increased storage carbon (sugars, starch,
oil, protein, etc.) or nitrogen (protein, etc.) per given weight of the sink tissue.

The expression in a plant of a gene that exists in double-stranded DNA form
involves transcription of messenger RNA (mRNA) from one strand of the DNA by RNA
polymerase enzyme, and the subsequent processing of the mRNA primary transcript inside
the nucleus. This processing involves a 3' non-translated region, which causes the addition of
polyadenylate nucleotides to the 3' end of the RNA. Transcription of DNA into mRNA is
regulated by a region of DNA usually referred to as the promoter. The promoter region
contains a sequence of bases that signals RNA polymerase to associate with the DNA and
to initiate the transcription of mRNA using one of the DNA strands as a template to make
a corresponding complementary strand of RNA. This RNA is then used as a template for
the production of the protein encoded therein by the cell's protein biosynthetic machinery.

A number of promoters that are active in plant cells have been described in the
literature. These include the nopaline synthase (NOS) and octopine synthase (OCS)
promoters (which are carried on tumor-inducing plasmids of Agrobacterium tumefaciens),
the caulimovirus promoters such as the cauliflower mosaic virus (CaMV) 19S and 35S and
the figwort mosaic virus (FMV) 35S promoters, the light-inducible promoter from the
small subunit of ribulose-1,5-bisphosphate carboxylase (ssRUBISCO), a very abundant
plant polypeptide, and the chlorophyll a/b binding protein gene promoters, etc. All of
these promoters have been used to create various types of DNA constructs that have been
expressed in plants; see, e.g., PCT publication WO 84/02913.

Promoters that are known to or are found to cause transcription of DNA in plant
cells can be used in the present invention. Such promoters may be obtained from a variety
of sources such as plants and plant viruses and include, but are not limited to, the enhanced
CaMV35S promoter and promoters isolated from plant genes such as ssRUBISCO genes.
As described below, it is preferred that the particular promoter selected should be capable
of causing sufficient expression to result in the production of an effective amount of
fructose 1,6 bisphosphate aldolase enzyme to cause the desired increase in carbon
assimilation, export or storage. Expression of the double-stranded DNA molecules of the
present invention can be driven by a constitutive promoter, expressing the DNA molecule
in all or most of the tissues of the plant. Alternatively, it may be preferred to cause
expression of the fda gene in specific tissues of the plant, such as leaf, stem, root, tuber,
seed, fruit, etc. The promoter chosen will have the desired tissue and developmental
specificity. Those skilled in the art will recognize that the amount of fructose 1,6
bisphosphate aldolase needed to induce the desired increase in carbon assimilation, export,
or storage may vary with the type of plant. Therefore, promoter function should be
optimized by selecting a promoter with the desired tissue expression capabilities and
approximate promoter strength and selecting a transformant that produces the desired
fructose 1,6 bisphosphate aldolase activity or the desired change in metabolism of
carbohydrates in the target tissues. This selection approach from the pool of transformants
is routinely employed in expression of heterologous structural genes in plants because
there is variation between transformants containing the same heterologous gene due to the
site of gene insertion within the plant genome (commonly referred to as "position effect").
In addition to promoters that are known to cause transcription (constitutive or
tissue-specific) of DNA in plant cells, other promoters may be identified for use in the current
invention by screening a plant cDNA library for genes that are selectively or preferably
expressed in the target tissues of interest and then isolating the promoter regions by
methods known in the art. In particular, for C4 plants such as corn,
sorghum, and sugarcane, it may be desirable to use a bundle sheath cell
specific (or cell enhanced expression) promoter to obtain the yield benefits of
overexpression of FDA, rather than
a constitutive promoter or a promoter with mesophyll cell enhanced expression properties.

For the purpose of expressing the fda gene in source tissues of the plant, such as
the leaf or stem, it is preferred that the promoters utilized in the double-stranded DNA
molecules of the present invention have relatively high expression in these specific tissues.
For this purpose, one may also choose from a number of promoters for genes with
leaf-specific or leaf-enhanced expression. Examples of such genes known from the literature
are the chloroplast glutamine synthetase GS2 from pea (Edwards et al., 1990), the
chloroplast fructose-1,6-bisphosphatase (FBPase) from wheat (Lloyd et al., 1991), the
nuclear photosynthetic ST-LS1 from potato (Stockhaus et al., 1989), and the phenylalanine
ammonia-lyase (PAL) and chalcone synthase (CHS) genes from Arabidopsis thaliana
(Leyva et al., 1995). Also shown to be active in photosynthetically active tissues are the
ribulose-1,5-bisphosphate carboxylase (RUBISCO), isolated from eastern larch (Larix
laricina) (Campbell et al., 1994); the cab gene, encoding the chlorophyll a/b-binding
protein of PSII, isolated from pine (cab6; Yamamoto et al., 1994), wheat (Cab-1; Fejes et
al., 1990), spinach (CAB-1; Luebberstedt et al., 1994), and rice (cab1R; Luan et al., 1992);
the pyruvate orthophosphate dikinase (PPDK) from maize (Matsuoka et al., 1993); the
tobacco Lhcb1*2 gene (Cerdan et al., 1997); the Arabidopsis thaliana SUC2 sucrose-H+
symporter gene (Truernit et al., 1995); and the thylakoid membrane proteins, isolated from
spinach (psaD, psaF, psaE, PC, FNR, atpC, atpD, cab, rbcS; Oelmueller et al., 1992).
Other chlorophyll a/b-binding proteins have been studied and described in the literature,
such as LhcB and PsbP from white mustard (Sinapis alba; Kretsch et al., 1995).
Homologous promoters to those described here may also be isolated from and tested in the
target or related crop plant by standard molecular biology procedures.

For the purpose of expressing the fda gene in sink tissues of the plant, for example the
tuber of the potato plant; the fruit of tomato; or seed of maize, wheat, rice, or barley, it is
preferred that the promoters utilized in the double-stranded DNA molecules of the present
invention have relatively high expression in these specific tissues. A number of genes with
tuber-specific or tuber-enhanced expression are known, including the class I patatin
promoter (Bevan et al., 1986; Jefferson et al., 1990); the potato tuber ADPGPP genes, both
the large and small subunits (Muller et al., 1990); sucrose synthase (Salanoubat and
Belliard, 1987, 1989); the major tuber proteins including the 22 kDa protein complexes
and proteinase inhibitors (Hannapel, 1990); the granule bound starch synthase gene
(GBSS) (Rohde et al., 1990); and the other class I and II patatins (Rocha-Sosa et al., 1989;
Mignery et al., 1988). Other promoters can also be used to express a fructose 1,6
bisphosphate aldolase gene in specific tissues, such as seeds or fruits. The promoter for
β-conglycinin (Tierney, 1987) or other seed-specific promoters, such as the napin and
phaseolin promoters, can be used to over-express an fda gene specifically in seeds. The
zeins are a group of storage proteins found in maize endosperm. Genomic clones for zein
genes have been isolated (Pedersen et al., 1982), and the promoters from these clones,
including the 15 kDa, 16 kDa, 19 kDa, 22 kDa, 27 kDa, and gamma genes, could also be
used to express an fda gene in the seeds of maize and other plants. Other promoters
known to function in maize, wheat, or rice include the promoters for the following genes:
waxy, Brittle, Shrunken 2, branching enzymes I and II, starch synthases, debranching
enzymes, oleosins, glutelins, and sucrose synthases. Particularly preferred promoters for
expression of an fda gene in maize endosperm, as well as in wheat and rice, are the promoter
for a glutelin gene from rice, more particularly the Osgt-1 promoter (Zheng et al., 1993);
the promoter of the maize granule-bound starch synthase (waxy) gene (zmGBS); the rice small subunit
ADPGPP promoter (osAGP); and the zein promoters, particularly the maize 27 kDa zein
gene promoter (zm27) (see, generally, Russell et al., 1997). Examples of promoters
suitable for expression of an fda gene in wheat include those for the genes for the
ADPglucose pyrophosphorylase (ADPGPP) subunits, for the granule bound and other
starch synthases, for the branching and debranching enzymes, for the
embryogenesis-abundant proteins, for the gliadins, and for the glutenins. Examples of such promoters in
rice include those for the genes for the ADPGPP subunits, for the granule bound and other
starch synthases, for the branching enzymes, for the debranching enzymes, for sucrose
synthases, and for the glutelins. A particularly preferred promoter is the promoter for rice
glutelin, Osgt-1. Examples of such promoters for barley include those for the genes for the
ADPGPP subunits, for the granule bound and other starch synthases, for the branching
enzymes, for the debranching enzymes, for sucrose synthases, for the hordeins, for the
embryo globulins, and for the aleurone-specific proteins.

The solids content of root tissue may be increased by expressing an fda gene
behind a root-specific promoter. An example of such a promoter is the promoter from the
acid chitinase gene (Samac et al., 1990). Expression in root tissue could also be
accomplished by utilizing the root-specific subdomains of the CaMV35S promoter that
have been identified (Benfey et al., 1989).

The RNA produced by a DNA construct of the present invention may also contain
a 5' non-translated leader sequence. This sequence can be derived from the promoter
selected to express the gene and can be specifically modified so as to increase translation
of the mRNA. The 5' non-translated regions can also be obtained from viral RNAs, from
suitable eukaryotic genes, or from a synthetic gene sequence. The present invention is not
limited to constructs, as presented in the following examples, wherein the non-translated
region is derived from the 5' non-translated sequence that accompanies the promoter
sequence. Rather, the non-translated leader sequence can be derived from an unrelated
promoter or coding sequence.

In monocots, an intron is preferably included in the gene construct to facilitate or
enhance expression of the coding sequence. Examples of suitable introns include the
HSP70 intron and the rice actin intron, both of which are known in the art. Another
suitable intron is the castor bean catalase intron (Suzuki et al., 1994).

Polyadenylation signal

The 3' non-translated region of the chimeric plant gene contains a polyadenylation
signal that functions in plants to cause the addition of polyadenylate nucleotides to the
3' end of the RNA. Examples of suitable 3' regions are (1) the 3' transcribed,
non-translated regions containing the polyadenylation signal of Agrobacterium tumor-inducing
(Ti) plasmid genes, such as the nopaline synthase (NOS) gene, and (2) plant genes like the
soybean storage protein genes and the small subunit of the ribulose-1,5-bisphosphate
carboxylase (ssRUBISCO) gene.

Plastid-directed expression of fructose-1,6-bisphosphate aldolase activity

In one embodiment of the invention, the fda gene may be fused to a chloroplast
transit peptide, in order to target the FDA protein to the plastid. As used hereinafter,
chloroplast and plastid are intended to include the various forms of plastids including
amyloplasts. Many plastid-localized proteins are expressed from nuclear genes as
precursors and are targeted to the plastid by a chloroplast transit peptide (CTP), which is
removed during the import steps. Examples of such chloroplast proteins include the small
subunit of ribulose-1,5-bisphosphate carboxylase (ssRUBISCO, SSU),
5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), ferredoxin, ferredoxin
oxidoreductase, the light-harvesting-complex protein I and protein II, and thioredoxin F. It
has been demonstrated that non-plastid proteins may be targeted to the chloroplast by use
of protein fusions with a CTP and that a CTP sequence is sufficient to target a protein to
the plastid. Those skilled in the art will also recognize that various other chimeric
constructs can be made that utilize the functionality of a particular plastid transit peptide to
import the fructose-1,6-bisphosphate aldolase enzyme into the plant cell plastid. The fda
gene could also be targeted to the plastid by transformation of the gene into the chloroplast
genome (Daniell et al., 1998).

Fructose 1,6 bisphosphate aldolases

As used herein, the term "fructose 1,6-bisphosphate aldolase" means an enzyme
(E.C. 4.1.2.13) that catalyzes the reversible cleavage of fructose 1,6-bisphosphate to form
glyceraldehyde 3-phosphate (G3P) and dihydroxyacetone phosphate (DHAP). Aldolase
enzymes are divided into two classes, designated class I and class II (Witke and Gotz,
1993). Various fda genes encoding the enzyme have been sequenced, as have numerous
proteins, such as the cytosolic enzyme from maize (GenBank Accession S07789; S10638),
cytosolic enzyme from rice (GenBank Accession JQ0543), cytosolic enzyme from spinach
(GenBank Accession S31091; S22093), from Arabidopsis thaliana (GenBank Accession
S11958), from spinach chloroplast (GenBank Accession S31090; A21815; S22092), from
yeast (S. cerevisiae) (GenBank Accession S07855; S37882; S12945; S39178;
S44523; X75781), from Rhodobacter sphaeroides (GenBank Accession B40767; D41080),
from B. subtilis (GenBank Accession S55426; D32354; E32354; D41835), from garden
pea (GenBank Accession S29048; S34411), from garden pea chloroplast (GenBank
Accession S29047; S34410), from maize (GenBank Accession S05019), from
Chlamydomonas reinhardtii (GenBank Accession S48639; S58485; S58486; S34367),
from Corynebacterium glutamicum (GenBank Accession S09283; X17313), from
Campylobacter jejuni (GenBank Accession S52413), from Haemophilus influenzae (strain
Rd KW20) (GenBank Accession C64074), from Streptococcus pneumoniae (GenBank
Accession AJ005697), from rice (GenBank Accession X53130), and from the maize
anaerobically regulated gene (GenBank Accession X12872).

The class I enzymes may be isolated from higher eukaryotes, such as animals and 
plants, and from some prokaryotes, including Peptococcus aerogenes (Lebherz and Rutter, 
1973), Lactobacillus casei (London and Kline, 1973), Escherichia coli (Stribling and 
Perham, 1973), Mycobacterium smegmatis (Bai et al., 1975), and most staphylococcal 
species (Gotz et al., 1979). The gene for the FDA enzyme may be obtained by known 
methods, and this has already been done for several organisms, such as rabbit (Lai et al., 
1974), human (Besmond et al., 1983), rat (Tsutsumi et al., 1984), Trypanosoma brucei 
(Clayton, 1985), and Arabidopsis thaliana (Chopra et al., 1990). These class I enzymes 
are invariably tetrameric proteins with a total molecular weight of about 160 kDa and 
function by imine formation between the substrate and a lysine residue in the active site 
(Alfounder et al., 1989). 

In animals, three class I isozymes, classified as A, B, and C, are expressed in the 
cytosol of muscle, liver, and brain tissue respectively, and they differ from plant aldolases 
in their expression and compartmentation patterns (Joh et al., 1986). In the leaves of 
higher plants, FDA is a class I enzyme, and two different isoenzymes within the class have 
been documented. One is contained in the chloroplast and the other in the cytosol 
(Lebherz et al., 1984). The acidic plant isozyme appears to be chloroplastic and comprises 
about 85% of the total leaf aldolase activity. The basic plant isozyme is cytosolic. Both 
isozymes appear to be encoded by the nuclear genome, by different genes 
(Lebherz et al., 1984). 

The class II type aldolases are normally dimeric with a molecular mass of 
approximately 80 kDa, and their activity depends on divalent metal ions. The class II 
enzymes may be isolated from prokaryotes, such as blue-green algae and bacteria, and 
from eukaryotic green algae and fungi (Baldwin et al., 1978). The gene for the FDA class II 
enzyme may be obtained by known methods, and this has already been done for several 
organisms including Saccharomyces cerevisiae (Jack and Harris, 1971), Bacillus 
stearothermophilus (Jack, 1973), and Escherichia coli (Baldwin et al., 1978). 

It is believed that highly homologous class II fructose 1,6-bisphosphate aldolases 
with similar catalytic activity will also be found in other species of microorganism, such 
as Saccharomyces (Saccharomyces cerevisiae); Bacillus (Bacillus subtilis); Rhodobacter 
(Rhodobacter sphaeroides); Plasmodium (Plasmodium falciparum, Plasmodium berghei); 
Trypanosoma (Trypanosoma brucei); Chlamydomonas (Chlamydomonas reinhardtii); 
Candida (Candida albicans); Corynebacterium (Corynebacterium glutamicum); 
Campylobacter (Campylobacter jejuni); and Haemophilus (Haemophilus influenzae). 
Such sequences can be readily isolated by methods well known in the art, for 
example by nucleic acid hybridization. The hybridization properties of a given pair of 
nucleic acids are an indication of their similarity or identity. Nucleic acid sequences can 
be selected on the basis of their ability to hybridize with known fda sequences. Low 
stringency conditions may be used to select sequences with less homology or identity. 
One may wish to employ conditions such as about 0.15 M to about 0.9 M sodium chloride, 
at temperatures ranging from about 20°C to about 55°C. High stringency conditions may 
be used to select for nucleic acid sequences with higher degrees of identity to the disclosed 
sequences. Conditions typically employed may include about 0.02 M to about 0.15 M 
sodium chloride, about 0.5% to about 5% casein, about 0.02% SDS or about 0.1% 
N-laurylsarcosine, about 0.001 M to about 0.03 M sodium citrate, at hybridization 
temperatures between about 50°C and about 70°C. More preferably, high stringency 
conditions are about 0.02 M sodium chloride, about 0.5% casein, about 0.02% SDS, about 
0.001 M sodium citrate, at a temperature of about 50°C. The skilled individual will 
recognize that numerous variations are possible in the conditions and means by which 
nucleic acid hybridization can be performed to isolate fda sequences having similarity to 
fda sequences known in the art, and that these are not limited to those explicitly disclosed 
herein. Preferably, such an approach is used to isolate fda sequences having greater than about 
60% identity with the disclosed E. coli fda sequence, more preferably greater than about 
70% identity, most preferably greater than about 80% identity. 
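The salt and temperature windows quoted above can be restated as a small lookup, which also makes the overlap between the two windows explicit. The following Python sketch is illustrative only: the name `stringency_classes` and the dictionary layout are assumptions, the "about" values from the text are treated as hard bounds, and the other buffer components (casein, SDS, sodium citrate) are ignored.

```python
# Approximate NaCl (M) and temperature (deg C) windows quoted in the text.
# Treating the "about" values as hard bounds is a simplifying assumption.
STRINGENCY_WINDOWS = {
    "low": {"nacl_m": (0.15, 0.9), "temp_c": (20.0, 55.0)},
    "high": {"nacl_m": (0.02, 0.15), "temp_c": (50.0, 70.0)},
}


def stringency_classes(nacl_m, temp_c):
    """Return the names of every quoted window that contains the given conditions."""
    matches = []
    for name, window in STRINGENCY_WINDOWS.items():
        low_salt, high_salt = window["nacl_m"]
        low_temp, high_temp = window["temp_c"]
        if low_salt <= nacl_m <= high_salt and low_temp <= temp_c <= high_temp:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # The "more preferably" high stringency point: 0.02 M NaCl at 50 deg C.
    print(stringency_classes(0.02, 50.0))
    # The windows touch at 0.15 M NaCl and overlap between 50 and 55 deg C.
    print(stringency_classes(0.15, 52.0))
```

Because the two windows share a boundary, a condition such as 0.15 M sodium chloride at 52°C falls in both, which is consistent with the text's remark that numerous variations in conditions are possible.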

Depending on growth conditions, Euglena gracilis, Chlamydomonas mundana, and 
Chlamydomonas reinhardtii produce either a class I or a class II aldolase (Cremona, 1968; 
Russell and Gibbs, 1967; Guerrini et al., 1971). 

The isolation of a class II fda gene from E. coli is described in the following 
examples. Its DNA sequence is given as SEQ ID NO:1 and shown in Figure 1. The 
amino acid sequence is given as SEQ ID NO:2 and shown in Figure 1. This gene can be 
used as isolated by inserting it into plant expression vectors suitable for the transformation 
method of choice as described. The E. coli FDA enzyme has an apparent pH optimum 
range near pH 7-9 and retains activity in the lower pH range of 5-7 (Baldwin et al., 1978; 
Alfounder et al., 1989). 

Thus, many different genes that encode a fructose 1,6 bisphosphate aldolase 
activity may be isolated and used in the present invention. 






Synthetic gene construction 

A carbohydrate metabolizing enzyme considered in this invention includes any 
sequence of amino acids, such as a protein, polypeptide, or peptide fragment, that 
demonstrates the ability to catalyze a reaction involved in the synthesis or degradation of 
starch or sucrose. These can be sequences obtained from a heterologous source, such as 
algae, bacteria, fungi, and protozoa, or endogenous plant sequences, by which is meant 
any sequence that can be naturally found in a plant cell, including native (indigenous) 
plant sequences as well as sequences from plant viruses or plant pathogenic bacteria. 

It will be recognized by one of ordinary skill in the art that carbohydrate 
metabolizing enzyme gene sequences may also be modified using standard techniques 
such as site-specific mutation or PCR, or modification of the sequence may be 
accomplished by producing a synthetic nucleic acid sequence, and the result will still be 
considered a carbohydrate biosynthesis enzyme nucleic acid sequence of this invention. For example, 
"wobble" positions in codons may be changed such that the nucleic acid sequence encodes 
the same amino acid sequence, or alternatively, codons can be altered such that 
conservative amino acid substitutions result. In either case, the peptide or protein 
maintains the desired enzymatic activity and is thus considered part of this invention. 
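As an illustration of changing "wobble" positions without changing the encoded protein, the following Python sketch recodes a short coding sequence with synonymous codons and checks that the translation is unchanged. It is a minimal sketch under stated assumptions: the function names `translate` and `recode` are hypothetical, the `HYPOTHETICAL_PREFERRED` table is invented for demonstration and is not a measured plant codon preference, and the test sequence is the eight coding codons of the 5' primer given in Example 1 (SEQ ID NO:3).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def recode(dna, preferred):
    """Swap each codon for the preferred synonymous codon of its amino acid."""
    usable = len(dna) - len(dna) % 3
    out = []
    for i in range(0, usable, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)


# Hypothetical preferences, for illustration only (not real plant usage data).
HYPOTHETICAL_PREFERRED = {"A": "GCC", "I": "ATC", "F": "TTC", "D": "GAC", "V": "GTG"}

original = "ATGGCTAAGATTTTTGATTTCGTA"  # coding part of SEQ ID NO:3
recoded = recode(original, HYPOTHETICAL_PREFERRED)

if __name__ == "__main__":
    print(original, translate(original))
    print(recoded, translate(recoded))
```

The nucleotide sequence changes at several third-codon positions while the translated peptide stays the same, which is the property the paragraph relies on.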

A nucleic acid sequence encoding a carbohydrate metabolizing enzyme may be a DNA or 
RNA sequence, derived from genomic DNA, cDNA, or mRNA, or may be synthesized in 
whole or in part. The structural gene sequences may be cloned, for example, by isolating 
genomic DNA from an appropriate source and amplifying and cloning the sequence of 
interest using a polymerase chain reaction (PCR). Alternatively, the gene sequences may 
be synthesized, either completely or in part, especially where it is desirable to provide 
plant-preferred sequences. Thus, all or a portion of the desired structural gene may be 
synthesized using codons preferred by a selected plant host. Plant-preferred codons may 
be determined, for example, from the codons used most frequently in the proteins 
expressed in a particular plant host species. Other modifications of the gene sequences 
may result in mutants having slightly altered activity. 

If desired, the gene sequence of the fda gene can be changed without changing the 
protein sequence in such a manner as may increase expression and thus even more 
positively affect carbohydrate content in transformed plants. A preferred manner for 
making the changes in the gene sequence is set out in PCT Publication WO 90/10076. A 
gene synthesized by following the methodology set out therein may be introduced into 
plants as described below and result in higher levels of expression of the FDA enzyme. 
This may be particularly useful in monocots such as maize, rice, wheat, sugarcane, and 
barley. 

Combinations with other transgenes 

The effect of fda in transgenic plants may be enhanced by combining it with other 
genes that positively affect carbohydrate assimilation or content, such as a gene encoding 
a sucrose phosphorylase as described in PCT Publication WO 96/24679, or ADPGPP 
genes such as the E. coli glgC gene and its mutant glgC16. PCT Publication WO 
91/19806 discloses how to incorporate the latter gene into many plant species in order to 
increase starch or solids. Another gene that can be combined with fda to increase carbon 
assimilation, export or storage is a gene encoding sucrose phosphate synthase (SPS). 
PCT Publication WO 92/16631 discloses one such gene and its use in transgenic plants. 
Plant transformation/regeneration 

In developing the nucleic acid constructs of this invention, the various components 
of the construct or fragments thereof will normally be inserted into a convenient cloning 
vector, e.g., a plasmid that is capable of replication in a bacterial host, e.g., E. coli. 
Numerous vectors exist that have been described in the literature, many of which are 
commercially available. After each cloning, the cloning vector with the desired insert may 
be isolated and subjected to further manipulation, such as restriction digestion, insertion of 
new fragments or nucleotides, ligation, deletion, mutation, resection, etc. so as to tailor the 
components of the desired sequence. Once the construct has been completed, it may then 
be transferred to an appropriate vector for further manipulation in accordance with the 
manner of transformation of the host cell. 

A recombinant DNA molecule of the invention typically includes a selectable 
marker so that transformed cells can be easily identified and selected from non-transformed 
cells. Examples of such include, but are not limited to, a neomycin 
phosphotransferase (nptII) gene (Potrykus et al., 1985), which confers kanamycin 
resistance. Cells expressing the nptII gene can be selected using an appropriate antibiotic 
such as kanamycin or G418. Other commonly used selectable markers include the bar 
gene, which confers bialaphos resistance; a mutant EPSP synthase gene (Hinchee et al., 
1988), which confers glyphosate resistance; a nitrilase gene, which confers resistance to 
bromoxynil (Stalker et al., 1988); a mutant acetolactate synthase gene (ALS), which 
confers imidazolinone or sulphonylurea resistance (European Patent Application 154,204, 
1985); and a methotrexate resistant DHFR gene (Thillet et al., 1988). 

Plants that can be made to have enhanced carbon assimilation, increased carbon 
export and partitioning by practice of the present invention include, but are not limited to, 
Acacia, alfalfa, aneth, apple, apricot, artichoke, arugula, asparagus, avocado, banana, 
barley, beans, beet, blackberry, blueberry, broccoli, brussels sprouts, cabbage, canola, 
cantaloupe, carrot, cassava, cauliflower, celery, cherry, cilantro, citrus, Clementines, 
coffee, corn, cotton, cucumber, Douglas fir, eggplant, endive, escarole, eucalyptus, fennel, 
figs, gourd, grape, grapefruit, honey dew, jicama, kiwifruit, lettuce, leeks, lemon, lime, 
Loblolly pine, mango, melon, mushroom, nut, oat, oil seed rape, okra, onion, orange, an 
ornamental plant, papaya, parsley, pea, peach, peanut, pear, pepper, persimmon, pine, 
pineapple, plantain, plum, pomegranate, poplar, potato, pumpkin, quince, radiata pine, 
radicchio, radish, raspberry, rice, rye, sorghum, Southern pine, soybean, spinach, squash, 
strawberry, sugarbeet, sugarcane, sunflower, sweet potato, sweetgum, tangerine, tea, 
tobacco, tomato, triticale, turf, a vine, watermelon, wheat, yams, and zucchini. 

A double-stranded DNA molecule of the present invention containing an fda gene 
can be inserted into the genome of a plant by any suitable method. Suitable plant 
transformation vectors include those derived from a Ti plasmid of Agrobacterium 
tumefaciens, as well as those disclosed, e.g., by Herrera-Estrella et al. (1983), Bevan 
(1984), Klee et al. (1985) and EPO publication 120,516. In addition to plant 
transformation vectors derived from the Ti or root-inducing (Ri) plasmids of 
Agrobacterium, alternative methods can be used to insert the DNA constructs of this 
invention into plant cells. Such methods may involve, for example, the use of liposomes, 
electroporation, chemicals that increase free DNA uptake, free DNA delivery via 
microprojectile bombardment, and transformation using viruses or pollen. DNA may also 
be inserted into the chloroplast genome (Daniell et al., 1998). 

A plasmid expression vector suitable for the introduction of an fda gene in 
monocots using microprojectile bombardment is composed of the following: a promoter 
that is specific or enhanced for expression in the starch storage tissues in monocots, 
generally the endosperm, such as promoters for the zein genes found in the maize 
endosperm (Pedersen et al., 1982); an intron that provides a splice site to facilitate 
expression of the gene, such as the Hsp70 intron (PCT Publication WO 93/19189); and a 3' 
polyadenylation sequence such as the nopaline synthase 3' sequence (NOS 3'; Fraley et al., 
1983). This expression cassette may be assembled on high copy replicons suitable for the 
production of large quantities of DNA. 

A particularly useful Agrobacterium-based plant transformation vector for use in 
transformation of dicotyledonous plants is plasmid vector pMON530 (Rogers et al., 1987). 
Plasmid pMON530 is a derivative of pMON505 prepared by transferring the 2.3 kb 
StuI-HindIII fragment of pMON316 (Rogers et al., 1987) into pMON526. Plasmid pMON526 
is a simple derivative of pMON505 in which the SmaI site is removed by digestion with 
XmaI, treatment with Klenow polymerase and ligation. Plasmid pMON530 retains all the 
properties of pMON505 and the CaMV35S-NOS expression cassette and now contains a 
unique cleavage site for SmaI between the promoter and polyadenylation signal. 

Binary vector pMON505 is a derivative of pMON200 (Rogers et al., 1987) in 
which the Ti plasmid homology region, LIH, has been replaced with a 3.8 kb HindIII to 
SmaI segment of the mini RK2 plasmid, pTJS75 (Schmidhauser and Helinski, 1985). This 
segment contains the RK2 origin of replication, oriV, and the origin of transfer, oriT, for 
conjugation into Agrobacterium using the tri-parental mating procedure (Horsch and Klee, 
1986). Plasmid pMON505 retains all the important features of pMON200 including the 
synthetic multi-linker for insertion of desired DNA fragments, the chimeric 
NOS/NPTII'/NOS gene for kanamycin resistance in plant cells, the 
spectinomycin/streptomycin resistance determinant for selection in E. coli and 
A. tumefaciens, an intact nopaline synthase gene for facile scoring of transformants and 
inheritance in progeny, and a pBR322 origin of replication for ease in making large 
amounts of the vector in E. coli. Plasmid pMON505 contains a single T-DNA border 
derived from the right end of the pTiT37 nopaline-type T-DNA. Southern blot analyses 
have shown that plasmid pMON505 and any DNA that it carries are integrated into the 
plant genome, that is, the entire plasmid is the T-DNA that is inserted into the plant 
genome. One end of the integrated DNA is located between the right border sequence and 
the nopaline synthase gene and the other end is between the border sequence and the 
pBR322 sequences. 

Another particularly useful Ti plasmid cassette vector is pMON17227. This vector 
is described in PCT Publication WO 92/04449 and contains a gene encoding an enzyme 
conferring glyphosate resistance (denominated CP4), which is an excellent selection 
marker gene for many plants, including potato and tomato. The gene is fused to the 
Arabidopsis EPSPS chloroplast transit peptide (CTP2) and expressed from the FMV 
promoter as described therein. 

When adequate numbers of cells (or protoplasts) containing the fda gene or cDNA 
are obtained, the cells (or protoplasts) are regenerated into whole plants. Choice of 
methodology for the regeneration step is not critical, with suitable protocols being 
available for hosts from Leguminosae (alfalfa, soybean, clover, etc.), Umbelliferae (carrot, 
celery, parsnip), Cruciferae (cabbage, radish, canola/rapeseed, etc.), Cucurbitaceae 
(melons and cucumber), Gramineae (wheat, barley, rice, maize, etc.), Solanaceae (potato, 
tobacco, tomato, peppers), various floral crops, such as sunflower, and nut-bearing trees, 
such as almonds, cashews, walnuts, and pecans. See, e.g., Ammirato et al. (1984); 
Shimamoto et al. (1989); Fromm (1990); Vasil et al. (1990); Vasil et al. (1992); 
Hayashimoto (1990); and Datta et al. (1990). 

The following definitions are provided in order to aid those skilled in the art in 
understanding the detailed description of the present invention. 

The term "promoter" or "promoter region" refers to a nucleic acid sequence, 
usually found upstream (5') to a coding sequence, that controls expression of the coding 
sequence by controlling production of messenger RNA (mRNA) by providing the 
recognition site for RNA polymerase or other factors necessary for start of transcription at 
the correct site. As contemplated herein, a promoter or promoter region includes 
variations of promoters derived by means of ligation to various regulatory sequences, 
random or controlled mutagenesis, and addition or duplication of enhancer sequences. 
The promoter region disclosed herein, and biologically functional equivalents thereof, are 
responsible for driving the transcription of coding sequences under their control when 
introduced into a host as part of a suitable recombinant vector, as demonstrated by their 
ability to produce mRNA. 

"Regeneration" refers to the process of growing a plant from a plant cell (e.g., plant 
protoplast or explant). 

"Transformation" refers to a process of introducing an exogenous nucleic acid 
sequence (e.g., a vector, recombinant nucleic acid molecule) into a cell or protoplast in 
which that exogenous nucleic acid is incorporated into a chromosome or is capable of 
autonomous replication. 

A "transformed cell" is a cell whose DNA has been altered by the introduction of 
an exogenous nucleic acid molecule into that cell. 

The term "gene" refers to chromosomal DNA, plasmid DNA, cDNA, synthetic 
DNA, or other DNA that encodes a peptide, polypeptide, protein, or RNA molecule, and 
regions flanking the coding sequence involved in the regulation of expression. 

"Identity" refers to the degree of similarity between two nucleic acid or protein 
sequences. An alignment of the two sequences is performed by a suitable computer 
program. A widely used and accepted computer program for performing sequence 
alignments is CLUSTALW v1.6 (Thompson et al., 1994). The number of matching bases 
or amino acids is divided by the total number of bases or amino acids and multiplied by 
100 to obtain a percent identity. For example, if two 580 base pair sequences had 145 
matched bases, they would be 25 percent identical. If the two compared sequences are of 
different lengths, the number of matches is divided by the shorter of the two lengths. For 
example, if there were 100 matched amino acids between a 200 and a 400 amino acid 
protein, they are 50 percent identical with respect to the shorter sequence. If the shorter 
sequence is less than 50 bases or amino acids in length, the number of matches is divided 
by 50 and multiplied by 100 to obtain a percent identity. 
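The arithmetic in this definition can be written out directly. The Python sketch below is illustrative only: the function names `count_matches` and `percent_identity` are hypothetical, and the alignment itself (performed in the text by a program such as CLUSTALW) is assumed to have been done already, so the functions only count matches and apply the three denominator rules.

```python
def count_matches(aligned_a, aligned_b, gap="-"):
    """Count identical, non-gap positions in two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)


def percent_identity(matches, length_a, length_b, minimum_length=50):
    """Percent identity per the definition in the text.

    The denominator is the shorter of the two (unaligned) sequence lengths,
    but never less than 50 bases or amino acids.
    """
    shorter = min(length_a, length_b)
    denominator = max(shorter, minimum_length)
    return 100.0 * matches / denominator


if __name__ == "__main__":
    print(percent_identity(145, 580, 580))  # two 580 bp sequences, 145 matches
    print(percent_identity(100, 200, 400))  # 200 vs 400 amino acid proteins
    print(percent_identity(20, 40, 300))    # shorter sequence under 50 residues
```

The first two calls reproduce the worked examples in the definition (25 and 50 percent). The third shows the floor of 50 in the denominator, which keeps very short sequences from scoring a high identity on only a handful of matches.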
"C-terminal region" refers to the region of a peptide, polypeptide, or protein chain 
from the middle thereof to the end that carries the amino acid having a free carboxyl 
group. 

The phrase "DNA segment heterologous to the promoter region" means that the 
coding DNA segment does not exist in nature in the same gene with the promoter to which 
it is now attached. 

The term "encoding DNA" refers to chromosomal DNA, plasmid DNA, cDNA, or 
synthetic DNA that encodes any of the enzymes discussed herein. 

The term "genome" as it applies to bacteria encompasses both the chromosome and 
plasmids within a bacterial host cell. Encoding DNAs of the present invention introduced 
into bacterial host cells can therefore be either chromosomally integrated or 
plasmid-localized. The term "genome" as it applies to plant cells encompasses not only 
chromosomal DNA found within the nucleus, but also organelle DNA found within subcellular 
components of the cell. DNAs of the present invention introduced into plant cells can 
therefore be either chromosomally integrated or organelle-localized. 

The terms "microbe" or "microorganism" refer to algae, bacteria, fungi, and 
protozoa. 

The term "mutein" refers to a mutant form of a peptide, polypeptide, or protein. 

"N-terminal region" refers to the region of a peptide, polypeptide, or protein chain 
from the amino acid having a free amino group to the middle of the chain. 

"Overexpression" refers to the expression of a polypeptide or protein encoded by a 
DNA introduced into a host cell, wherein said polypeptide or protein is either not normally 
present in the host cell, or wherein said polypeptide or protein is present in said host cell at 
a higher level than that normally expressed from the endogenous gene encoding said 
polypeptide or protein. 

The term "plastid" refers to the class of plant cell organelles that includes 
amyloplasts, chloroplasts, chromoplasts, elaioplasts, eoplasts, etioplasts, leucoplasts, and 
proplastids. These organelles are self-replicating and contain what is commonly referred 
to as the "chloroplast genome," a circular DNA molecule that ranges in size from about 
120 kb to about 217 kb, depending upon the plant species, and which usually contains an 
inverted repeat region. 

The phrase "simple carbohydrate substrate" means a monosaccharide or an 
oligosaccharide but not a polysaccharide; simple carbohydrate substrates include glucose, 
fructose, sucrose, and lactose. More complex carbohydrate substrates commonly used in 
media such as corn syrup, starch, and molasses can be broken down to simple 
carbohydrate substrates. 

The term "solids" refers to the nonaqueous component of a tuber (such as in 
potato) or a fruit (such as in tomato) comprised mostly of starch and other 
polysaccharides, simple carbohydrates, nonstructural carbohydrates, amino acids, and 
other organic molecules. 

The following examples are provided to better elucidate the practice of the present 
invention and should not be interpreted in any way to limit the scope of the present 
invention. Those skilled in the art will recognize that various modifications, truncations, 
etc., can be made to the methods and genes described herein while not departing from the 
spirit and scope of the present invention. 

EXAMPLES 

EXAMPLE 1 

cDNA cloning and overexpression 

Unless otherwise stated, basic DNA manipulations and genetic techniques, such as 
PCR, agarose electrophoresis, restriction digests, ligations, E. coli transformations, colony 
screens, and Western blots were performed essentially by the protocols described in 
Sambrook et al. (1989) or Maniatis et al. (1982). 

The E. coli fda gene sequence (SEQ ID NO:1) was obtained from GenBank 
(Accession Number X14682), and nucleotide primers with homology to the 5' and 3' ends 
were designed for PCR amplification. E. coli chromosomal DNA was extracted and the 
E. coli fda gene was amplified by PCR using the 5' oligonucleotide 
5'GGGGCCATGGCTAAGATTTTTGATTTCGTA3' (SEQ ID NO:3) and the 3' 
oligonucleotide 5'CCCCGAGCTCTTACAGAACGTCGATCGCGTTCAG3' (SEQ ID 
NO:4). The PCR cycling conditions were as follows: 94°C, 5 min (1 cycle); addition of 
polymerase; 94°C, 1 min, 60°C, 1 min, 72°C, 2 min 30 sec (35 cycles). The 1.08 kb PCR 
product was gel purified and ligated into an E. coli expression vector, pMON5723, to form 
a vector construct that was used for transformation of frozen competent E. coli JM101 
cells. The pMON5723 vector contains the E. coli recA promoter and the T7 gene 10 leader 
(G10L) sequences, which enable high level expression in E. coli (Wong et al., 1988). After 
induction of the transformed cells, a distinct protein band of about 40 kDa was apparent on 
an SDS-PAGE gel, which correlates with the size of the subunit polypeptide chain of the 
dimeric aldolase II. It was shown that most of the induced protein was present in the 
soluble phase. A gel slice containing the highly induced protein was isolated and 
antibodies were produced in a goat, which was injected with the homogenized gel slice 
(emulsified in Freund's complete adjuvant). 
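The two primers can be inspected with a few lines of Python. This is a sketch for illustration, not part of the described protocol: the helper `reverse_complement` and the variable names are assumptions. The hexamers CCATGG and GAGCTC are the recognition sequences of NcoI and SacI respectively; the text does not name the enzymes used for cloning, so reading these as the intended cloning sites is an assumption.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


FORWARD_PRIMER = "GGGGCCATGGCTAAGATTTTTGATTTCGTA"      # SEQ ID NO:3
REVERSE_PRIMER = "CCCCGAGCTCTTACAGAACGTCGATCGCGTTCAG"  # SEQ ID NO:4

# Recognition sequences; their use as cloning sites here is an assumption.
NCOI_SITE = "CCATGG"
SACI_SITE = "GAGCTC"

# The 3' primer anneals to the sense strand, so its reverse complement
# shows how the end of the amplified coding strand reads.
sense_3_prime_end = reverse_complement(REVERSE_PRIMER)

ncoi_position = FORWARD_PRIMER.find(NCOI_SITE)
start_codon_position = FORWARD_PRIMER.find("ATG")
stop_then_saci_position = sense_3_prime_end.find("TAA" + SACI_SITE)

if __name__ == "__main__":
    print("NcoI-like site at index", ncoi_position)
    print("ATG start codon at index", start_codon_position)
    print("sense-strand 3' end:", sense_3_prime_end)
    print("TAA stop followed by SacI-like site at index", stop_then_saci_position)
```

Read this way, the ATG start codon lies inside the CCATGG hexamer of the 5' primer, and on the sense strand a TAA stop codon sits immediately upstream of the GAGCTC hexamer contributed by the 3' primer.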

The fda gene sequence was subsequently cloned into another E. coli expression 
vector, under the control of the tac promoter. Induction with IPTG of JM101 cells 
transformed with this vector showed the same 40 kDa overexpressed protein band. This 
new clone was used in an enzyme assay for FDA activity. Cells transformed with this 
vector construct were grown in a liquid culture, induced with IPTG, and grown for another 
3 hours. Subsequently, a 3 mL cell culture was spun down, resuspended in 100 mM Tris and 
sonicated. The cell debris was spun down, and the crude cell extract supernatant was 
assayed for FDA activity, using a coupled enzymatic assay as described by Baldwin et al. 
(1978). This assay was routinely performed at 30°C. 

The reaction was performed in a 1 mL final volume in the presence of excess 
triosephosphate isomerase (TIM) and alpha-glycerophosphate dehydrogenase 
(GDH) in a reaction mixture containing final concentrations of 100 mM Tris pH 8.0, 4.75 
mM fructose 1,6 bisphosphate, 0.15 mM NADH, 500 U/mL TIM, and 30 U/mL GDH. 

The decrease in absorbance at 340 nm, after addition of the cell extract supernatant, 
was recorded using an HP diode array spectrophotometer. One international unit (I.U.) of 
aldolase activity is that causing the oxidation of 2 nmol of NADH/min in this assay 
system. 

Cell extracts containing the vector with the fda sequence showed a substantial 
increase in aldolase activity (13.1 I.U./mg protein) as compared to cells transformed with 
the control vector (0.15 I.U./mg protein). The activity was shown to be inhibited by 
EDTA, known to specifically inhibit class II aldolases. 
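The conversion from an absorbance trace to the activity figures reported above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the molar absorptivity of NADH at 340 nm (6220 M^-1 cm^-1) is a standard literature value that the text does not state, a 1 cm light path is assumed, and the unit definition (2 nmol NADH per minute) is taken as printed. The factor of two reflects that, with TIM and GDH in excess, each molecule of fructose 1,6-bisphosphate cleaved leads to the oxidation of two molecules of NADH.

```python
NADH_EXTINCTION_M_CM = 6220.0  # assumed literature value at 340 nm, M^-1 cm^-1
NMOL_NADH_PER_UNIT = 2.0       # unit definition as printed in the text


def nadh_nmol_per_min(delta_a340_per_min, volume_ml=1.0, path_cm=1.0):
    """Convert a rate of absorbance decrease into nmol NADH oxidized per minute."""
    molar_per_min = delta_a340_per_min / (NADH_EXTINCTION_M_CM * path_cm)
    return molar_per_min * (volume_ml / 1000.0) * 1e9


def specific_activity(delta_a340_per_min, protein_mg, volume_ml=1.0, path_cm=1.0):
    """Aldolase units per mg protein, using the unit definition above."""
    units = nadh_nmol_per_min(delta_a340_per_min, volume_ml, path_cm) / NMOL_NADH_PER_UNIT
    return units / protein_mg


# Ratio of the reported specific activities (fda vector vs. control vector).
fold_increase = 13.1 / 0.15

if __name__ == "__main__":
    # Hypothetical reading: A340 falls by 0.0622 per minute with 0.5 mg protein.
    print(nadh_nmol_per_min(0.0622))
    print(specific_activity(0.0622, protein_mg=0.5))
    print(round(fold_increase, 1))
```

The absorbance reading in the usage lines is invented for demonstration; only the two specific activities used for `fold_increase` come from the text, and their ratio corresponds to an increase of roughly 87-fold over the control.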

EXAMPLE 2 

Plant transformation and fda expression in tobacco 
Targeting of FDA protein 

E. coli fructose 1,6 bisphosphate aldolase was targeted to the plastid in plants in 
order to assess its influence on carbohydrate metabolism and starch biosynthesis in these 
plant organelles. To accomplish the import of the E. coli aldolase into the plastids, a vector 
was constructed in which the aldolase was fused to the Arabidopsis small subunit transit 
peptide (CTP1) (Stark et al., 1992) or the maize small subunit CTP (Russell et al., 1993), 
creating constructs in which the CTP-fda fusion gene was located between the 35S 
promoter from the figwort mosaic virus (P-FMV35S; Gowda et al., 1989) and the 
3'-nontranslated region of the nopaline synthase gene (NOS 3'; Fraley et al., 1983) 
sequences. The vector construct containing the expression cassette 
[P-FMV/CTP1/fda/NOS3'] was subsequently used for tobacco protoplast transformation, 
which was performed as described in Fromm et al. (1987), with the following 
modifications. Tobacco cultivar Xanthi line D (Txd) cell suspensions were grown in 
250-mL flasks, at 25°C and 138 rpm in the dark. For maintenance, a sub-culture volume of 9 
mL was removed and added to 40 mL of fresh Txd media containing MS salts, 3% 
sucrose, 0.2 g/L inositol, 0.13 g/L asparagine, 80 µL of a 50 mg/mL stock of PCPA, 5 µL 
of a 1 mg/mL stock of kinetin, and 1 mL of 1000x vitamins (1.3 g/L nicotinic acid, 0.25 
g/L thiamine, 0.25 g/L pyridoxine HCl, and 0.25 g/L calcium pantothenate) every 3 to 4 
days. Protoplasts were isolated from 1-day-old suspension cells that came from a 
2-day-old culture. Sixteen milliliters of cells were added to 40 mL of fresh Txd media and 
allowed to grow 24 hours prior to digestion and isolation of the protoplasts. The 
centrifugation stage for the enzyme mix was eliminated. The electroporation buffer 
and protoplast isolation media were filter sterilized rather than autoclaved. The 
electroporation buffer did not have 4 mM CaCl2 added. The suspension cells were 
digested in enzyme for 1 hour. Protoplasts were counted on a hemacytometer, counting 
only the protoplasts that looked intact and circular. Bio-Rad Gene Pulser cuvettes (catalog # 
165-2088) with a 0.4-cm gap and a maximum volume of 0.8 mL were used for the 
electroporations. Fifty to 100 µg of DNA containing the gene of interest along with 5 µg 
of internal control DNA containing the luciferase gene were added per cuvette. The final 
protoplast density at electroporation was 2 x 10^6/mL and electroporator settings were a 500 
µFarad capacitance and 140 volts on the Bio-Rad Gene Pulser. Protoplasts were put on ice 
after resuspension in electroporation buffer and remained on ice in cuvettes until 10 
minutes after electroporation. Protoplasts were added to 7 mL of Txd media + 0.4 M 
mannitol and conditioning media after electroporation. At this stage coconut water was no 
longer used. The protoplasts were grown in 1- hour day/night photoperiod regime at 26°C 
and were spun down and assayed or frozen 20-24 hours after electroporation. 

Western blot analysis performed on the protoplast extracts, obtained after 
transformation, showed processing into the mature FDA in the tobacco protoplasts. 
Expression was detected of a protein migrating at approximately 40 kDa, which is the 
molecular weight of the aldolase subunit and the size of the protein also observed after 
overexpression of the aldolase in E. coli. 
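The approximately 40 kDa band is consistent with the figures given elsewhere in this description, as a quick back-of-the-envelope check shows. The Python sketch below is an estimate only: the function names are hypothetical, the 110 Da average residue mass is a common rule of thumb rather than a value from the text, and the 1.08 kb PCR product of Example 1 is treated as if it were entirely coding sequence even though it includes short primer tails.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not from the text


def estimated_subunit_kda(coding_bp):
    """Rough polypeptide mass from coding length, excluding one stop codon."""
    codons = coding_bp // 3
    residues = codons - 1
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


def subunit_from_oligomer(total_kda, subunits):
    """Subunit mass implied by an oligomer of identical subunits."""
    return total_kda / subunits


if __name__ == "__main__":
    print(estimated_subunit_kda(1080))     # from the 1.08 kb PCR product
    print(subunit_from_oligomer(80, 2))    # class II: dimer of about 80 kDa
    print(subunit_from_oligomer(160, 4))   # class I: tetramer of about 160 kDa
```

Both routes land near 40 kDa per subunit, in line with the band observed on the Western blots and on the SDS-PAGE gel in Example 1.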

The expression cassette [P-FMV/CTP1/fda/NOS3'] was subsequently cloned into 
the NotI site of pMON17227 (described in PCT Publication WO 92/04449), in the same 
orientation as the selectable marker expression cassette, to form the plant transformation 
vector pMON17524, as shown in Figure 2 (SEQ ID NO:5). 
An additional construct was made and used for tobacco protoplast transformation, 
fusing the fda gene to the Arabidopsis EPSPS transit peptide (CTP2), which is described 
in US patent 5,463,175. The transit peptide was cloned (through the SphI site) into the 
SphI site located immediately upstream from the N-terminus of the fda gene sequence in 
the CTP1-fda fusion (described above). This new CTP2-fda fusion gene was then cloned 
into a vector between the FMV promoter and the NOS 3' sequences. When this construct 
containing the CTP2/fda gene sequences was used for tobacco protoplast transformation, 
expression was detected of a protein migrating at approximately 40 kDa, which is the 
molecular weight of the aldolase subunit and the size of the protein also observed after 
overexpression of the aldolase in E. coli. 

The NotI cassette [P-FMV/CTP2/fda/NOS3'] from this construct was then cloned
into the NotI site of pMON17227, in the same orientation as the selectable marker
expression cassette, to form the plant transformation vector pMON17542, which is shown
in Figure 3 (SEQ ID NO:6).

For cytoplasmic expression of the FDA in tobacco protoplasts, a construct was 
made in which the fda gene sequence (without being coupled to a transit peptide) was 
cloned into a vector backbone, between the FMV promoter and the NOS 3' sequences.
Using this construct for tobacco protoplast transformation also showed expression of a
protein of the same size, migrating at approximately 40 kDa.

fda expression in tobacco plants

Two constructs containing the fda gene, fused to the Arabidopsis small subunit
CTP1 (pMON17524) (SEQ ID NO:5, Figure 2) and the Arabidopsis EPSPS (CTP2) transit
peptide (pMON17542) (SEQ ID NO:6, Figure 3), were used for tobacco plant
transformation, as described in US patent 5,463,175. A vector without the CTP-fda
sequences, pMON17227 (described in PCT Publication WO 92/04449), was used as a
negative control. The plant transformation vectors were mobilized into the ABI
Agrobacterium strain. Mating of the plant vector into the ABI strain was done by the
triparental conjugation system using the helper plasmid pRK2013 (Ditta et al., 1980).

Growth chamber-grown tobacco transformant lines were generated and first
screened by Western blot analysis to identify expressors using goat antibody raised against
E. coli-expressed fda. Subsequently, for pMON17524-expressing tobacco lines, leaf
nonstructural carbohydrates (sucrose, glucose, and starch hydrolyzed into
glucose) were analyzed by means of a YSI Instrument, Model 2700 Select Biochemistry Analyzer.
Starting at flowering stage, leaf samples were also taken from these plants and analyzed
for diurnal changes in leaf nonstructural carbohydrates.

Five hundred milligrams to 1 g fresh tobacco leaf tissue samples were harvested
and extracted in 5 mL of hot Na-phosphate buffer (40 g/L NaH2PO4 and 10 g/L Na2HPO4
in double de-ionized water) by homogenization with a Polytron. Test tubes were then
placed in an 85°C water bath for 15 minutes. Tubes were centrifuged for 12 minutes at
3000 rpm and the supernatants saved for soluble sugar analysis. The pellet was
resuspended in 5 mL of hot Na-phosphate buffer, mixed with a Vortex, and centrifuged as
described above. The supernatant was carefully removed and added to the previous
supernatant fraction for soluble sugar (sucrose and glucose) analysis by YSI using
appropriate membranes.

The starch was extracted from the pellet using the Megazyme Kit (Megazyme,
Australia). To the pellet, 200 µL of 50% ethanol and 3 mL of thermostable alpha-amylase
(300 U) were added and the mixture vortexed. Samples were then incubated in boiling
water for 6 minutes and stirred after 2 and 4 minutes. Tubes were placed in a 50°C water
bath and 4 mL of 200 mM acetate buffer (pH 4.5) were added, followed by 0.1 mL
amyloglucosidase (20 U). Incubation occurred for 1 hour. Test tubes were then centrifuged
for 15 minutes at 3000 rpm. Aliquots were taken from the supernatant and analyzed for
glucose by YSI. The free glucose was adjusted to anhydrous glucose (as it occurs in starch)
by multiplying by the ratio 162/180. The total volume per tube was 7.1 mL.
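
The conversion from a glucose reading to leaf starch content can be sketched as follows. This is only an illustration under stated assumptions: the function name, the assumption that the analyzer reports glucose in g/L, and the example reading are hypothetical; the 7.1 mL digest volume and the anhydroglucose correction are the only values taken from the procedure above.

```python
# Molar masses: glucose (C6H12O6) = 180 g/mol; anhydroglucose unit in starch (C6H10O5) = 162 g/mol.
ANHYDRO_FACTOR = 162.0 / 180.0


def starch_mg_per_g_fw(glucose_g_per_l, fresh_weight_g, total_volume_ml=7.1):
    """Convert a glucose reading of the starch digest into mg starch per g fresh weight.

    glucose_g_per_l : glucose concentration of the digest supernatant (assumed unit: g/L)
    fresh_weight_g  : fresh weight of the extracted leaf sample, in grams
    total_volume_ml : final digest volume per tube (7.1 mL in the procedure)
    """
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    # g/L is numerically equal to mg/mL, so concentration * volume gives mg of glucose.
    glucose_mg = glucose_g_per_l * total_volume_ml
    # Correct free glucose to the anhydrous form in which it occurs in starch.
    starch_mg = glucose_mg * ANHYDRO_FACTOR
    return starch_mg / fresh_weight_g


# Hypothetical reading: 2.0 g/L glucose from a 0.5 g leaf sample.
print(f"{starch_mg_per_g_fw(2.0, 0.5):.2f} mg starch per g fresh weight")
```

Keeping the correction factor as a named constant makes it easy to see which part of the result depends on the molar-mass ratio rather than on the measurement.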

As seen in Table 1, expression of the fda gene in tobacco correlated with a
significant increase in leaf starch levels. However, referring to Figure 4, when a diurnal
profile of starch levels was established in the fda-expressing leaves, this increase was
apparent mainly early in the photoperiod, which is a phase when leaves are known to have
peak photosynthetic activity. This increase in starch has no apparent negative effect on the
plant because the increased starch is turned over during the dark period. There was no
apparent increase in steady state levels of sucrose or glucose in tobacco leaves expressing
E. coli fda as compared to the control.

Table 1

Leaf Carbohydrate Levels of Plants Expressing
the fda Transgene¹ (pMON17524)

            High Expressors          Low Expressors    Negative
            (>0.01% total protein)   (<0.01%)          Control
                              (mg/g fresh weight)

STARCH      35.08 ± 2.84             23.25 ± 3.20      16.69 ± 2.92
SUCROSE      0.97 ± 0.17              0.86 ± 0.25       0.66 ± 0.19
GLUCOSE      1.88 ± 0.17              1.58 ± 0.20       1.68 ± 0.26

¹ Leaf samples were harvested at midday.



A second set of transgenic tobacco plants transformed with the construct
pMON17542 were grown in the greenhouse. Tobacco plants containing a vector without
the CTP-fda sequences, pMON17227, were used as negative control. Of all the
pMON17542 lines screened for expression by Western blot analysis, 18 were high
expressors (>0.01% of the total cellular protein) and 15 lines were low expressors
(<0.01%). Fifteen plants containing the null vector, pMON17227, were used as control.
Fully expanded leaves from plants expressing the fda transgene and negative controls were
tested for sucrose export by collecting phloem exudate from excised leaf systems. The
phloem exudation technique is described in Groussol et al. (1986). Leaves were harvested
at 11:30 AM and placed in an exudation medium, containing 5 mM EDTA at pH 6.0, and
allowed to exude for a period of 4 hours under full light and high humidity. The exudation
solution was immediately analyzed for sucrose level, as described above in the
carbohydrate analysis method. As seen in Table 2, a significant increase in sucrose export
out of source leaves was observed in plants expressing the fda transgene.

This increase in sucrose export by fda-expressing leaves is an illustration of an
increase in source capacity, very likely due to an increased carbon flow through the Calvin
Cycle (in response to increased triose-P utilization) and thus an increase in net carbon
utilization by the leaf. As seen in Table 2, the increase in sucrose loading in the phloem
correlates with the level of fda expression.

Table 2

Levels of Sucrose in Phloem Exudate from
Excised Leaves of fda Transgenic Tobacco Plants (pMON17542)

                       Water uptake      Sucrose in phloem exudate
                       (µl/g F.Wt./h)    (ng/leaf)    (ng/g F.Wt.)

fda high expressors    320 ± 20          330 ± 60     108 ± 22
fda low expressors     340 ± 10          210 ± 10      77 ± 3
Control                390 ± 30          160 ± 10      56 ± 3

Referring to Table 3, preliminary analysis of plant growth and development
revealed no significant differences in number of leaves or pods per plant, plant height,
stem diameter, or apparent seed weight per plant, between plants expressing the fda gene
and the vector control under the specific growing and analysis conditions. However, as
seen in Table 4, the fda-transgenic plants had a significantly higher root mass. This may
be an indication that, under these conditions, roots represented a more dominant sink that
attracted excess carbohydrate produced by the source leaves. Furthermore, the present
illustration shows that the increase in root mass in the presence of the E. coli fda gene was
accomplished with no apparent negative effect on shoot growth, inflorescence, or seed set.
Therefore, this increase in root growth and final root dry weight is a desirable plant trait
because it would lead to a rapid seedling establishment following germination and greater
plant ability to tolerate drought, cold stress, other environmental challenges, and
transplanting. In different plants and under different growing conditions, other plant parts
(such as seed, fruit, stem, leaf, tuber, bulb, etc.) are expected to show the weight increase
observed in tobacco roots overexpressing the fda transgene.

Table 3

Assessment of Certain Plant Growth and Development Parameters in
Tobacco Expressing the fda Transgene¹ (pMON17542)

                   #pods/plant   #leaves/plant   Plant height   Seed weight
                                                 (cm)           (g/plant)

high expressors    162 ± 40      25.4 ± 0.8      65.3 ± 3.1     18.8 ± 2.4
Control            156 ± 28      24.4 ± 0.5      65.8 ± 5.1     17.3 ± 2.6

¹ To achieve this analysis, 14 high-expressor lines were compared to 15 control plants.
Measurements were made prior to seed harvest (most pods have reached maturity). The number of
leaves was confirmed by counting the number of nodes to account for leaf drop.



Table 4

Tobacco Root Dry Weight of Plants Expressing
the E. coli fda Transgene¹ (pMON17542)

                       Root Dry Weight
                       (g/plant)

fda high expressors    64.0 ± 3.9
fda low expressors     62.7 ± 5.4
Control                31.7 ± 1.6

¹ Roots from 5 high and 7 low expressing lines and 6 control plants were excised and washed
carefully then placed in a 65°C drying oven for at least 48 hours. Roots were removed from the
oven and allowed to equilibrate in the laboratory for 2 hours before dry weight determination.

EXAMPLE 3

Plant transformation and fda expression in corn plants 

Targeting of FDA protein 

Vectors containing the fda gene with and without the plastid targeting peptide
were made for transformation in corn and are also suitable for other monocots, including
rice, wheat, barley, sugarcane, triticale, etc.

For the cytosolic expression of the fda gene in corn plants, a construct was made
in which the fda gene sequence was fused to the backbone of a vector containing the
enhanced CaMV 35S promoter (e35S; Kay et al., 1987), the HSP70 intron (US patent
5,593,874), and the NOS3' polyadenylation sequence (Fraley et al., 1983). This created a
NotI cassette [P-e35S/HSP70 intron/fda/NOS3'] that was cloned into the NotI site of
pMON30460, a monocot transformation vector, to form the plant transformation vector
pMON13925, as shown in Figure 5. pMON30460 contains an expression cassette for the
selectable marker neomycin phosphotransferase type II gene (nptII) [P-35S/NPTII/NOS3']
and a unique NotI site for cloning the gene of interest. The final vector (pMON13925)
was constructed so that the gene of interest and the selectable marker gene were cloned in
the same orientation. A vector fragment containing the expression cassettes for these gene
sequences could be excised from the bacterial selector (Kan) and ori, gel purified, and used
for plant transformation.

For the chloroplast-targeted expression of the fda gene in corn plants, a construct
was made in which the fda gene sequence, coupled to the maize RUBISCO small subunit
CTP (Russell et al., 1993), was fused to the backbone of a vector containing the enhanced
(CaMV) 35S promoter, the HSP70 intron, and the NOS3' polyadenylation sequences. This
created a NotI cassette [P-e35S/HSP70 intron/mzSSuCTP/fda/NOS3'] that was cloned
into the NotI site (in the same orientation as the selectable marker cassette
[P-35S/NPTII/NOS3']) of the monocot transformation vector pMON30460, to form the vector
pMON17590, as shown in Figure 6. From this vector a fragment containing the fda gene
expression cassette and the selectable marker cassette could be excised from the bacterial
selector (Kan) and ori, gel purified, and used for plant transformation.

For the cytosolic endosperm-specific expression of the aldolase gene in corn, the
fda gene sequence was cloned into a vector (in the same orientation as the selectable
marker cassette [P-35S/NPTII/NOS3']) containing the glutelin gene promoter P-osgt1
(Zheng et al., 1993), the HSP70 intron, and the NOS3' polyadenylation sequences to form
the vector pMON13936, as shown in Figure 7. From this vector a fragment containing the
fda gene expression cassette [P-osgt1/HSP70 intron/fda/NOS3'] and the selectable marker
cassette could be excised from the bacterial selector (Kan) and ori, gel purified, and used
for plant transformation.

Maize plant transformation

Transgenic maize plants transformed with the vectors pMON13925 (described
above) or pMON17590 (described above) were produced using microprojectile
bombardment, a procedure well-known to the art (Fromm, 1990; Gordon-Kamm et al.,
1990; Walters et al., 1992). Embryogenic callus initiated from immature maize embryos
was used as a target tissue. Plasmid DNA at 1 mg/mL in TE buffer was precipitated onto
M10 tungsten particles using a calcium chloride/spermidine procedure, essentially as
described by Klein et al. (1988). In addition to the gene of interest, the plasmids also
contained the neomycin phosphotransferase II gene (nptII) driven by the 35S promoter
from Cauliflower Mosaic Virus. The embryogenic callus target tissue was pretreated on
culture medium osmotically buffered with 0.2 M mannitol plus 0.2 M sorbitol for
approximately four hours prior to bombardment (Vain et al., 1993). Tissue was
bombarded two times with the DNA-coated tungsten particles using the gunpowder
version of the BioRad Particle Delivery System (PDS) 1000 device. Approximately 16
hours following bombardment, the tissue was subcultured onto a medium of the same
composition except that it contained no mannitol or sorbitol, and it contained an
appropriate aminoglycoside antibiotic, such as G418, to select for those cells that
contained and expressed the 35S/nptII gene. Actively growing tissue sectors were
transferred to fresh selective medium approximately every 3 weeks. About 3 months after
bombardment, plants were regenerated from surviving embryogenic callus essentially as
described by Duncan and Widholm (1988).

Aldolase activity from transgenic maize

In order to measure leaf aldolase activity, corn leaf samples were taken and
immediately frozen on dry ice. Aldolase enzyme was extracted from the leaf tissue by
grinding the leaf tissue at 4°C in 1.2 mL of the extraction buffer (100 mM Hepes, pH 8.0,
5 mM MgCl2, 5 mM MnCl2, 100 mM KCl, 10 mM DTT, 1% BSA, 1 mM PMSF, 10
µg/mL leupeptin, 10 ng/mL aprotinin). The extract was centrifuged at 15,000 x g, at 4°C
for 3 minutes, and the non-desalted supernatant was assayed for enzyme activity. This
extraction method gave about 60% recovery of E. coli FDA activity.

Total aldolase activity was determined in 0.98 mL of reaction mixture that
consisted of 100 mM EPPS-NaOH, pH 8.5, 1 mM fructose-bisphosphateJU mM NADH,
5 mM MgCl2, 4 units of alpha-glycerophosphate dehydrogenase, and 15 units of
triosephosphate isomerase. The reaction was initiated by addition of 20 µL of leaf extract.
The resulting data, generated from a single experiment, are presented in Table 5.

Table 5

Aldolase Activity from Transgenic Maize Leaves

Lines            A340/min/20 µL    Activity %

H99 (control)    0.113             100
pMON17590        0.233             206
pMON13925        0.251             222
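
The "Activity %" column of Table 5 is each absorbance rate divided by the control rate. The sketch below recomputes that column and, as a further illustration, converts a rate into micromoles of fructose-1,6-bisphosphate cleaved per minute. The conversion rests on assumptions that are not stated in the text: a 1-cm light path, a final assay volume of 1.0 mL (0.98 mL of mixture plus 20 µL of extract), the standard NADH extinction coefficient of 6.22 mM^-1 cm^-1 at 340 nm, and two NADH oxidized per substrate molecule in the coupled assay. The function names are hypothetical.

```python
NADH_EXT_COEFF = 6.22  # mM^-1 cm^-1 at 340 nm (standard literature value, an assumption here)
NADH_PER_FBP = 2       # with triosephosphate isomerase present, both triose products are reduced


def percent_of_control(rate, control_rate):
    """Express an absorbance rate as a percentage of the control rate."""
    return 100.0 * rate / control_rate


def fbp_cleaved_umol_per_min(delta_a340_per_min, assay_volume_ml=1.0, path_cm=1.0):
    """Convert an A340 decrease per minute into micromoles of FBP cleaved per minute."""
    # A / (epsilon * path) gives mM NADH oxidized per minute; mM * mL = micromoles.
    nadh_umol = delta_a340_per_min / (NADH_EXT_COEFF * path_cm) * assay_volume_ml
    return nadh_umol / NADH_PER_FBP


# Rates taken from Table 5 (A340/min per 20 uL of extract).
rates = {"H99 (control)": 0.113, "pMON17590": 0.233, "pMON13925": 0.251}
control = rates["H99 (control)"]
for line, rate in rates.items():
    pct = round(percent_of_control(rate, control))
    units = fbp_cleaved_umol_per_min(rate)
    print(f"{line}: {pct}% of control, {units:.4f} umol FBP/min")
```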



A phenotype was visible in the primary transformants (R0 plants) expressing the
E. coli FDA when the protein was targeted to the chloroplast. The leaves were chlorotic
but seed set was normal. R1 plants were grown in both field and greenhouse
experiments. Starch was not detectable in the leaves using iodine staining, and
pollination was delayed. It is believed that the phenotype in these corn plants may be the
result of the promoter (e35S) used in both the pMON17590 and pMON13925 vectors not
being preferred for causing FDA expression in corn. Because e35S is believed to cause
mesophyll-enhanced expression and the Calvin Cycle in a C4 plant such as corn occurs
predominantly in the bundle sheath cells, the use of a promoter directing enhanced
expression in the bundle sheath cells (such as the ssRUBISCO promoter) may be
preferred. Vectors containing such a promoter and driving expression of FDA have been
prepared and are being tested in maize.

In particular, the maize RuBISCO small subunit promoter (PmzSSU, a bundle sheath
cell-specific promoter) has been used to construct vectors for cell-specific fda expression in
maize. A class I aldolase (fdaI), an fda without an iron sulfur cluster and with different
properties from fdaII, was utilized to improve carbon metabolism in C4 crops (e.g. maize).
The gene for the class I aldolase was amplified from the genome of Staphylococcus
aureus and activity was confirmed. Transformation vectors were then constructed to
express both classes of aldolase (fdaI and fdaII) in a cell-specific manner in maize. The
following cassettes have been made:
pMON13899: PmzSSU/hsp70/mzSSU CTP/fdaI
pMON13990: PmzSSU/hsp70/mzSSU CTP/fdaII
pMON13988: P35S/hsp70/fdaI

These vectors were used for corn transformation as described generally above. The
biochemical and physiological analysis of the primary transformants should allow for the
identification of aldolase gene overexpression that will lead to increased growth,
development, and yield in maize.

Also, two vectors were used for transformation of corn which would target the
expression of the E. coli fdaII gene in the maize endosperm. The vector pMON13936
uses the rice gt1 promoter to drive expression of aldolase in the cytoplasm of the
endosperm cells. Another vector (pMON36416) uses the same promoter with the maize
RuBISCO small subunit transit peptide to localize the protein in the amyloplasts.
Homozygous lines of the cytosolic aldolase transformants have been identified
(homozygosity of 37 plants was confirmed using western blot analysis) and seed from
these plants was collected for grain composition analysis (moisture, protein, starch, and
oil). Of the 53 pMON36416 primary transformants screened for amyloplast-targeted
aldolase expression, 11 were positive. These plants will be tested for homozygosity
selection/propagation and kernels from the homozygotes will be used for composition
analysis.

EXAMPLE 4

Plant transformation and fda expression in potato plants

Targeting of fda expression

The plant expression vector pMON17542 (described earlier), in which the fda
gene is expressed behind the FMV promoter and the aldolase enzyme is fused to the
chloroplast transit peptide CTP2, was used for Agrobacterium-mediated potato
transformation.

A second potato transformation vector was constructed by cloning the NotI
cassette [P-FMV/CTP2/fda/NOS3'] (described earlier) into the unique NotI site of
pMON23616. pMON23616 is a potato transformation vector containing the nopaline-type
T-DNA right border region (Fraley et al., 1985), an expression cassette for the neomycin
phosphotransferase type II gene [P-35S/NPTII/NOS3'] (selectable marker), a unique NotI
site for cloning the gene expression cassette of interest, and the T-DNA left border region
(Barker et al., 1983). Cloning of the NotI cassette [P-FMV/CTP2/fda/NOS3'] (described
earlier) into the NotI site of pMON23616 results in the potato transformation vector
pMON17581, as shown in Figure 8. The vector pMON17581 was constructed such that
the gene of interest and the selectable marker gene were transcribed in the same direction.

Potato plant transformation

The plant transformation vectors were mobilized into the ABI Agrobacterium
strain. Mating of the plant vector into the ABI strain was done by the triparental
conjugation system using the helper plasmid pRK2013 (Ditta et al., 1980). The vector
pMON17542 was used for potato transformation via Agrobacterium transformation of
Russet Burbank potato callus, following the method described in PCT Publication WO
96/03513 for glyphosate selection of transformed lines.

After transformation with the vector pMON17542, transgenic potato plantlets that
came through selection on glyphosate were screened for expression of E. coli aldolase by
leaf Western blot analysis. Out of 112 independent lines assayed, 50 fda-expressing lines
(45%) were identified, with fda expression levels ranging between 0.12% and 1.2% of
total extractable protein.

The plant transformation vector pMON17581 was used for Agrobacterium-mediated
transformation of HS31-638 potato callus. HS31-638 is a Russet Burbank potato
line previously transformed with the mutant ADPglucose pyrophosphorylase (glgC16)
gene from E. coli (U.S. Patent 5,498,830). The potato callus was transformed following the
method described in PCT Publication WO 96/03513, substituting kanamycin
(administered at a concentration of 150-200 mg/L) for glyphosate as a selective agent.

The transgenic potato plants were screened for expression of the fda gene by
assaying leaf punches from tissue culture plantlets. Western blot analysis (using antibodies
raised against the E. coli aldolase) of leaf tissue from the pMON17581-transformed lines
identified 12 expressing lines out of 56 lines screened. Expression was detected of a
protein migrating at approximately 40 kDa, which is the molecular weight of the E. coli
(class II) aldolase subunit and the size of the protein observed after overexpression of the
aldolase in E. coli.

Specific gravity measurements of transgenic potato plants 

From the 50 fda-expressing potato lines obtained after transformation with
pMON17542, 7 of the highest expressing lines were micropropagated in tissue culture, and
8 copies of each line were planted in pots 14 inches in diameter and 12 inches deep,
containing a mixture of: l A Metro 350 potting media, V* fine sand, V* Ready Earth
potting media. Wild-type Russet Burbank plantlets from tissue culture were planted as
controls. All plants were cultivated for approximately 5 months in the greenhouse in 
which daytime temperature was approximately 21-23°C while nighttime temperature was 
approximately 13°C. Plants were watered every other day throughout their active growing
period and fertilized with Peter's 20-20-20 commercial fertilizer once a week, at levels
similar to commercial applications. Fertilization was carried out only for the first 2 1/2
months, at which point fertilization was stopped completely. Plants were allowed to
naturally senesce, and at approximately 50% senescence, tubers were harvested.

For each line at harvest, all tubers from all 8 pots were pooled and a total weight
was obtained. Then for each line, tubers 30 g or greater were pooled and specific gravity
was determined on this group of tubers. Specific gravity is the weight of the tubers in air
divided by the difference between the weight in air and the weight in water. Results of
these weight measurements are presented in Table 6.

Table 6

Specific gravity measurements from transgenic potato plants

Line #   Total    Overall    Combined    % Increase in    Combined Weight of    Specific
         Weight   % Yield    Weight      Total Weight     Tubers over 30g       Gravity
                  Increase   of Tubers   (Tubers over     (% of Total Weight)
                             over 30g    30g)

RB       6609                4477                         67.70%                1.087
40652    5138     neg        1307        neg              25.40%                1.08
40611    7170     8.5%       4533        1.3%             63.20%                1.083
40608    7470     13.0%      1070        neg              14.30%                1.081
40632    7776     21.8%      5453        21.8%            70.10%                1.088
40614    8688     31.5%      5468        22.2%            62.90%                1.083
40631    8800     33.2%      6188        38.2%            70.30%                1.084
40610    9746     47.0%      7777        73.0%            80%                   1.087

This table summarizes the tuber yield and specific gravity for all seven lines grown in the
greenhouse. The results indicate that, in comparison to the control, all but one of the fda
lines show an increase in overall tuber yield, and that in four lines, there is a corresponding
increase in percentage of tubers that weigh more than 30 g. For combined tubers over 30
g, the percent of total weight is near that of the control, and for two lines is greater than the
control. This indicates that five out of the six of the lines show higher overall yield and
are not making smaller tubers. In other words, with the increase in overall yield, there is a
corresponding increase in percentage of bigger tubers (over 30 g). However, there is no
increase in specific gravity of the tubers.

In conclusion, it appears that expression of fda in potato produces greater numbers
of tubers per plant without a sacrifice in tuber size. This represents a yield benefit in that
the farmer could potentially be able to produce the same amount of tubers using less
acreage. Similar experiments will also be performed by co-expression of fda with other
carbohydrate metabolizing genes, such as glgC16, in order to determine how such
combinations will affect tuber yield, tuber solids deposition and overall tuber specific
gravity.
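
The derived columns of Table 6 (percent yield increase over the control, and the share of total weight made up by tubers over 30 g) can be recomputed from the raw weights. The sketch below does this for the RB control row and the line 40611 row only; the function names are hypothetical and the weights are used in whatever unit the table reports.

```python
def percent_increase(value, control):
    """Percent change of a line's value relative to the control value."""
    return 100.0 * (value - control) / control


def percent_of_total(part, total):
    """Share of the total weight contributed by one size class, in percent."""
    return 100.0 * part / total


# Raw weights from Table 6: (total weight, combined weight of tubers over 30 g).
rb_total, rb_over_30g = 6609, 4477        # RB control
line_total, line_over_30g = 7170, 4533    # line 40611

print(f"Overall yield increase: {percent_increase(line_total, rb_total):.1f}%")
print(f"Increase in tubers over 30 g: {percent_increase(line_over_30g, rb_over_30g):.1f}%")
print(f"Tubers over 30 g, share of total: {percent_of_total(line_over_30g, line_total):.1f}%")
print(f"Control share of total: {percent_of_total(rb_over_30g, rb_total):.1f}%")
```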

Aldolase activity from transgenic potato 

After being cultivated for 3 months (post planting) in the greenhouse, leaf samples
were taken from 6 of the highest fda-expressing potato lines, obtained after transformation
with pMON17542, and assayed for aldolase activity.

In order to measure potato leaf aldolase activity, duplicate leaf samples from each
line were taken and immediately frozen on dry ice. Aldolase was extracted from 0.2 g of
leaf tissue by grinding at 4°C in 1.2 mL of the extraction buffer: 100 mM Hepes, pH 8.0,
5 mM MgCl2, 5 mM MnCl2, 100 mM KCl, 10 mM DTT, 1% BSA, 1 mM PMSF, 10
µg/mL leupeptin, 10 ng/mL aprotinin. The extract was assayed for aldolase activity as
described earlier.

Six independent transgenic potato lines expressing fda were tested for aldolase
activity. The expression of fda in leaves is an indicator of the expression in the whole
plant because the FMV promoter used to drive expression of the respective encoding
DNAs directs gene expression constitutively in most, if not all, tissues of potato plants.

Table 7 summarizes the quantitative protein expression data for each of the lines,
and the percent activity for each individual line.

Table 7

Aldolase Activity from
Transgenic Russet Burbank Potato Leaves

           Exp. #1                 Exp. #2                 Average
Lines      Act (U/gFW)   % Act     Act (U/gFW)   % Act     % Activity

Control    4.461         100       4.732         100       100
40608      6.969         156       8.055         170       163
40610      8.489         190       7.326         155       173
40652      5.812         130       6.367         135       132
40632      5.257         118       4.244          90       104
40631      5.764         129       4.968         105       117
40611      5.715         128       5.836         123       126
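
The "Average % Activity" column of Table 7 can be reproduced by expressing each experiment's activity as a percentage of that experiment's control and then averaging the two percentages. The following sketch assumes that this is how the column was derived (the text does not say so explicitly); the function name is hypothetical and the activities are copied from the table.

```python
# Activities in U/g fresh weight for (Exp. #1, Exp. #2), copied from Table 7.
CONTROL = (4.461, 4.732)
LINES = {
    "40608": (6.969, 8.055),
    "40610": (8.489, 7.326),
    "40652": (5.812, 6.367),
    "40632": (5.257, 4.244),
    "40631": (5.764, 4.968),
    "40611": (5.715, 5.836),
}


def average_percent_activity(activities, control_activities):
    """Mean of per-experiment activities, each expressed as percent of its own control."""
    percents = [100.0 * a / c for a, c in zip(activities, control_activities)]
    return sum(percents) / len(percents)


for line, activities in LINES.items():
    print(line, round(average_percent_activity(activities, CONTROL)))
```

Normalizing each experiment to its own control before averaging keeps day-to-day differences in assay conditions from dominating the comparison between lines.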



Solids uniformity in transgenic potato

Twenty-five Russet Burbank lines expressing fda (potato lines designated
"Maestro"), obtained after transformation with pMON17542, and fifteen Russet Burbank
Simple Solid lines, also containing glgC16 (PCT Publication WO 91/19806 and US Patent
5,498,830), expressing fda (potato lines designated "Segal"), obtained after transformation
with pMON17581, were field tested at two different sites. For each field site, 36 plants
per line (three repetitions of 12 plants per line) were evaluated for tuber solids distribution.
At harvest, tubers were pre-sorted at each field site into a ten to twelve ounce category,
and nine tubers from each replicated plot were analyzed in groups of three.

For a typical 10-12 ounce tuber having a diameter of 7-8 cm, starch distribution
was evaluated by removing the center longitudinal slice (13 mm) from each tuber. Slices
were then peeled and laid flat on a cutting board where the inner tuber region (pith region)
was removed by a 14-mm cork punch. The tissue from pith to cortex (perimedullary
region) was removed by a cork punch of up to 2 inches. The remaining cortex tissue was
approximately an 8-mm wide ring from the outermost region of the slice.

Specific gravity was then determined by weighing both the pooled pith punches
and pooled cortex punches in air and then in water:

Specific gravity = Air Wt./(Air Wt. - Water Wt.)

After calculating specific gravity, solids levels were determined by the following equation:

% Solids = -214.9206 + (218.1852 × Sp. Gravity)

The degree of solids uniformity (Solids Uniformity Index) is determined by calculating the
pith to cortex solids ratio (pith solids divided by cortex solids). The three groups of three
tubers per plot were averaged, at which point the average of three plot replications was
calculated per field site.
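
The chain from weighings to the Solids Uniformity Index can be sketched as follows. The two equations are the ones given above; the function names and the example weights are hypothetical, and the index is expressed as a percentage to match the way the ratios are reported in the tables.

```python
def specific_gravity(air_weight, water_weight):
    """Specific gravity = weight in air / (weight in air - weight in water)."""
    if air_weight <= water_weight:
        raise ValueError("weight in air must exceed weight in water")
    return air_weight / (air_weight - water_weight)


def percent_solids(sp_gravity):
    """Solids level from specific gravity, using the regression given in the text."""
    return -214.9206 + 218.1852 * sp_gravity


def solids_uniformity_index(pith_solids, cortex_solids):
    """Pith solids divided by cortex solids, expressed as a percentage."""
    return 100.0 * pith_solids / cortex_solids


# Hypothetical pooled punch weights in grams (weight in air, weight in water).
pith = percent_solids(specific_gravity(50.0, 3.5))
cortex = percent_solids(specific_gravity(50.0, 4.5))
print(f"pith solids {pith:.1f}%, cortex solids {cortex:.1f}%, "
      f"index {solids_uniformity_index(pith, cortex):.1f}%")
```

In practice the index would be computed for each group of three tubers and then averaged per plot and per site, as described above.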

Analyses of several previous solids uniformity field trials (data not shown) have
demonstrated nontransgenic, wild-type Russet Burbank potato to have a typical pith to
cortex tuber solids ratio within the range of 68% to 72%, depending on growing region
and agricultural practices. Tables 8-11 provide the pith to cortex solids ratios by plant line
number, with a higher pith to cortex solids ratio indicating a greater degree of solids
uniformity.

Tables 8 and 9 represent the data from one field site (site 1) for Segal and Maestro,
respectively, and illustrate that the majority of Segal and Maestro lines have higher pith to
cortex solids ratios than that of 68.4% for the Russet Burbank control, with some lines
approaching an 82% pith to cortex solids ratio.

Tables 10 and 11 represent the data from another field site (site 2) for Segal and
Maestro, respectively, and also illustrate that the majority of Maestro and Segal lines have
higher pith to cortex solids ratios than that of the Russet Burbank control, with some lines
approaching an 88% pith to cortex solids ratio. In the site 2 field trial, the Russet Burbank
control had an atypical, abnormally high pith-to-cortex solids uniformity ratio of 79.3%,
which was most likely due to environmental growing conditions. The site 2 results
demonstrate that expression in Russet Burbank potato of E. coli fda, alone or with
co-expression of glgC16, increases tuber solids uniformity even in a growing season when
tuber solids uniformity is already extremely high in nontransgenic Russet Burbank. That
is, the fda gene continues to perform when agricultural conditions are already conducive to
an abnormally high solids uniformity level.

Table 8. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio.
Segal Russet Burbank Lines. Site 1

Line          Ratio

S-29          79.1
S-9           75.8
S-20          71.3
S-15          71.3
S-21          70.5
S-5           70.2
S-18          70.0
RB control    68.4
S-32          68.3
S-16          65.6

Table 9. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio.
Maestro Russet Burbank Lines. Site 1

Line          Ratio

M-13          74.0
M-12          73.6
M-1           73.4
M-3           73.0
M-6           72.4
M-9           71.2
M-11          70.6
M-18          70.5
M-17          69.9
M-19          69.4
M-5           69.3
M-20          68.9
RB control    68.4
M-8           68.3
M-43          67.7
M-23          67.3
M-7           67.0
M-39          66.6
M-22          66.0
M-10          65.4
M-27          61.4

Table 10. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio
Segal Russet Burbank Lines. Site 2

Line          Ratio

J"JJ          O / .*T
S-54          87.1
S-05          86.8
S-29          85.1
S-21          84.3
S-16          83.2
S-20          81.5
S-18          80.7
S-32          80.6
RB control    79.3
S-09          79.0



Table 11. Solids Uniformity Index: Pith Solids to Cortex Solids Ratio.
Maestro Russet Burbank Lines. Site 2

Line          Ratio
M-04          87.7
M-18          83.9
M-17          83.8
M-03          83.7
M-09          83.4
M-15          83.2
M-29          82.9
M-44          82.3
M-08          82.2
M-43          81.6
M-22          81.1
M-05          80.8
M-01          80.5
M-20          80.2
M-45          79.6
M-39          79.5
M-27          79.5
RB control    79.3
M-13          78.9
M-22          78.8
M-19          78.7
M-07          78.2
M-12          77.9
M-23          77.3
M-06          76.5
M-10          75.0
M-11          74.1

The effect of aldolase on pith to cortex solids ratios in the Segal lines is slightly
more dramatic than in the Maestro lines. We believe this phenotype is due to expression of
fda in a background in which the Russet Burbank host expresses glgC16 at a relatively low
to moderate level, and that the combination of fda plus glgC16 provides improved
benefits. Cross sectional tuber slices (Figure 9) of three Segal lines with improved solids
uniformity illustrate a greater deposition of starch within the inner regions of the tuber.
Specifically, an increase in cortex volume accompanied by relocation of the xylem ring
towards the center of the tuber, plus a more opaque pith tissue due to an increase in starch
density, are evident in the transgenic lines. This physiological alteration may be due to an
increase in sucrose translocation from source to sink, which may influence phloem
element distribution during tuber development or sucrose availability for starch
biosynthesis across the tuber.

Example 5

Plant transformation and FDA expression in cotton plants

The E. coli fda vectors pMON17524 [FMV/CTP1/fda] (Figure 2) and
pMON17542 [FMV/CTP2/fda] (Figure 3) were transformed into cotton using
Agrobacterium as described by Umbeck et al. (1987) and in US Patent 5004863. The
protein was targeted to the chloroplast using either the Arabidopsis SSU CTP1
(pMON17524) or the Arabidopsis EPSPS (pMON17542) chloroplast transit peptide.

Aldolase expression in cotton

Five-week-old calli transformed with each of the two vectors were analyzed by Western blot
analyses and by aldolase assays. Western blot analysis indicated a large amount of protein
at the position of the full-length FDA standard and a lesser amount at the same position in
the control callus extracts. It appeared that the protein was fully processed. To verify that
FDA was expressed in the tissue and for comparison of activity, calli transformed with the
two vectors were extracted in a buffer that would prevent loss of activity of the transgene
product. BSA was added to a final concentration of 1 mg/mL, which limited the analysis of
processing on import by Western blot. Aldolase assays were performed plus or minus 25
mM EDTA, which inhibits the E. coli enzyme but not the plant enzyme. The results of the
assays are shown in Table 12.
Table 12

Aldolase Activity in Cotton Calli and Cotton Leaf
ΔA340 e-3/mg protein/5 min

Colony #                           -EDTA    +EDTA    Fold Increase

Controls
Cotton Leaf (Coker)                  4.0      4.2
Uninoculated Calli                   7.7      5.6      1.3X
Inoculated Calli (E35S/GUS) #1       6.8      6.1
                            #2       3.5      4.0

FDA calli
pMON17542 #1                         3.5      2.3      1.5X
          #3                         5.5      2.6      2.1X
          #5                         9.2      3.8      2.4X
          #4                        19.8      3.6      5.5X
pMON17524 #2                        15.2      5.8      2.6X
          #3                        12.5      4.0      3.1X
          #5                        14.4      2.9      4.9X
          #6                         4.1      1.2      3.5X



The results indicate that there is good expression of the fda gene in cotton callus. Almost all
calli had at least twofold higher aldolase activity, and the increase was sensitive to inhibition by
EDTA. Processing appeared complete by Western blot analysis using these samples.
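
The Fold Increase column of Table 12 appears to be the ratio of activity without EDTA to activity with EDTA. That reading is an assumption, although it reproduces the tabulated values to within rounding. A minimal Python sketch under that assumption, with illustrative names, recomputes the ratio for the fda-transformed calli:

```python
# Activities transcribed from Table 12 for the fda-transformed calli.
# Each entry: (label, activity without EDTA, activity with 25 mM EDTA).
FDA_CALLI = [
    ("pMON17542 #1", 3.5, 2.3),
    ("pMON17542 #3", 5.5, 2.6),
    ("pMON17542 #5", 9.2, 3.8),
    ("pMON17542 #4", 19.8, 3.6),
    ("pMON17524 #2", 15.2, 5.8),
    ("pMON17524 #3", 12.5, 4.0),
    ("pMON17524 #5", 14.4, 2.9),
    ("pMON17524 #6", 4.1, 1.2),
]


def edta_ratio(minus_edta, plus_edta):
    """Total activity divided by the EDTA-insensitive (plant) activity."""
    return minus_edta / plus_edta


ratios = {label: edta_ratio(m, p) for label, m, p in FDA_CALLI}
at_least_twofold = [label for label, r in ratios.items() if r >= 2.0]

for label, r in ratios.items():
    print(f"{label}: {r:.1f}X")
print(f"{len(at_least_twofold)} of {len(ratios)} calli at or above 2X")
```

Because EDTA inhibits the E. coli enzyme but not the plant enzyme, a ratio well above 1 is consistent with EDTA-sensitive activity contributed by the transgene product.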



(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGTCTAAGA TTTTTGATTT CGTAAAACCT GGCGTAATCA CTGGTGATGA CGTACAGAAA
60

GTTTTCCAGG TAGCAAAAGA AAACAACTTC GCACTGCCAG CAGTAAACTG CGTCGGTACT
120

GACTCCATCA ACGCCGTACT GGAAACCGCT GCTAAAGTTA AAGCGCCGGT TATCGTTCAG
180

TTCTCCAACG GTGGTGCTTC CTTTATCGCT GGTAAAGGCG TGAAATCTGA CGTTCCGCAG
240

GGTGCTGCTA TCCTGGGCGC GATCTCTGGT GCGCATCACG TTCACCAGAT GGCTGAACAT
300

TATGGTGTTC CGGTTATCCT GCACACTGAC CACTGCGCGA AGAAACTGCT GCCGTGGATC
360

GACGGTCTGT TGGACGCGGG TGAAAAACAC TTCGCAGCTA CCGGTAAGCC GCTGTTCTCT
420

TCTCACATGA TCGACCTGTC TGAAGAATCT CTGCAAGAGA ACATCGAAAT CTGCTCTAAA
480

TACCTGGAGC GCATGTCCAA AATCGGCATG ACTCTGGAAA TCGAACTGGG TTGCACCGGT
540

GGTGAAGAAG ACGGCGTGGA CAACAGCCAC ATGGACGCTT CTGCACTGTA CACCCAGCCG
600

GAAGACGTTG ATTACGCATA CACCGAACTG AGCAAAATCA GCCCGCGTTT CACCATCGCA
660

GCGTCCTTCG GTAACGTACA CGGTGTTTAC AAGCCGGGTA ACGTGGTTCT GACTCCGACC
720

ATCCTGCGTG ATTCTCAGGA ATATGTTTCC AAGAAACACA ACCTGCCGCA CAACAGCCTG
780

AACTTCGTAT TCCACGGTGG TTCCGGTTCT ACTGCTCAGG AAATCAAAGA CTCCGTAAGC
840

TACGGCGTAG TAAAAATGAA CATCGATACC GATACCCAAT GGGCAACCTG GGAAGGCGTT
900

CTGAACTACT ACAAAGCGAA CGAAGCTTAT CTGCAGGGTC AGCTGGGTAA CCCGAAAGGC
960

GAAGATCAGC CGAACAAGAA ATACTACGAT CCGCGCGTAT GGCTGCGTGC CGGTCAGACT
1020

TCGATGATCG CTCGTCTGGA GAAAGCATTC CAGGAACTGA ACGCGATCGA CGTTCTGTAA
1080



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 359 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: Linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ser Lys Ile Phe Asp Phe Val Lys Pro Gly Val Ile Thr Gly
5 10 15

Asp Asp Val Gln Lys Val Phe Gln Val Ala Lys Glu Asn Asn Phe
20 25 30

Ala Leu Pro Ala Val Asn Cys Val Gly Thr Asp Ser Ile Asn Ala
35 40 45

Val Leu Glu Thr Ala Ala Lys Val Lys Ala Pro Val Ile Val Gln
50 55 60

Phe Ser Asn Gly Gly Ala Ser Phe Ile Ala Gly Lys Gly Val Lys
65 70 75

Ser Asp Val Pro Gln Gly Ala Ala Ile Leu Gly Ala Ile Ser Gly
80 85 90

Ala His His Val His Gln Met Ala Glu His Tyr Gly Val Pro Val
95 100 105

Ile Leu His Thr Asp His Cys Ala Lys Lys Leu Leu Pro Trp Ile
110 115 120

Asp Gly Leu Leu Asp Ala Gly Glu Lys His Phe Ala Ala Thr Gly
125 130 135

Lys Pro Leu Phe Ser Ser His Met Ile Asp Leu Ser Glu Glu Ser
140 145 150

Leu Gln Glu Asn Ile Glu Ile Cys Ser Lys Tyr Leu Glu Arg Met
155 160 165

Ser Lys Ile Gly Met Thr Leu Glu Ile Glu Leu Gly Cys Thr Gly
170 175 180

Gly Glu Glu Asp Gly Val Asp Asn Ser His Met Asp Ala Ser Ala
185 190 195

Leu Tyr Thr Gln Pro Glu Asp Val Asp Tyr Ala Tyr Thr Glu Leu
200 205 210

Ser Lys Ile Ser Pro Arg Phe Thr Ile Ala Ala Ser Phe Gly Asn
215 220 225

Val His Gly Val Tyr Lys Pro Gly Asn Val Val Leu Thr Pro Thr
230 235 240

Ile Leu Arg Asp Ser Gln Glu Tyr Val Ser Lys Lys His Asn Leu
245 250 255

Pro His Asn Ser Leu Asn Phe Val Phe His Gly Gly Ser Gly Ser
260 265 270

Thr Ala Gln Glu Ile Lys Asp Ser Val Ser Tyr Gly Val Val Lys
275 280 285

Met Asn Ile Asp Thr Asp Thr Gln Trp Ala Thr Trp Glu Gly Val
290 295 300

Leu Asn Tyr Tyr Lys Ala Asn Glu Ala Tyr Leu Gln Gly Gln Leu
305 310 315

Gly Asn Pro Lys Gly Glu Asp Gln Pro Asn Lys Lys Tyr Tyr Asp
320 325 330

Pro Arg Val Trp Leu Arg Ala Gly Gln Thr Ser Met Ile Ala Arg
335 340 345

Leu Glu Lys Ala Phe Gln Glu Leu Asn Ala Ile Asp Val Leu
350 355
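
The relationship between SEQ ID NO: 1 and SEQ ID NO: 2 can be checked by translating the nucleotide sequence with the standard genetic code. The sketch below is illustrative only and is not part of the original listing; the function name and the one-letter output format are choices made here, and only the first and last 60 nucleotides of SEQ ID NO: 1 are used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string (spaces allowed) up to the first stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


# First and last 60 nucleotides of SEQ ID NO: 1 as printed in the listing.
first_60 = "ATGTCTAAGA TTTTTGATTT CGTAAAACCT GGCGTAATCA CTGGTGATGA CGTACAGAAA"
last_60 = "TCGATGATCG CTCGTCTGGA GAAAGCATTC CAGGAACTGA ACGCGATCGA CGTTCTGTAA"

print(translate(first_60))
print(translate(last_60))
```

The 1080 nucleotides of SEQ ID NO: 1 correspond to 360 codons, that is, 359 residues plus the stop codon, which agrees with the stated length of SEQ ID NO: 2.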



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GGGGCCATGG CTAAGATTTT TGATTTCGTA 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 
CCCCGAGCTC TTACAGAACG TCGATCGCGT TCAG 
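
The two oligonucleotides in SEQ ID NO: 3 and SEQ ID NO: 4 can be compared with the ends of the fda coding sequence in SEQ ID NO: 1. The sketch below is an illustration added here, not part of the original listing. The helper name is hypothetical, and reading CCATGG and GAGCTC as the NcoI and SacI recognition sequences is an annotation made here rather than something stated in this section.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, ignoring spaces."""
    return dna.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


SEQ_ID_3 = "GGGGCCATGGCTAAGATTTTTGATTTCGTA"
SEQ_ID_4 = "CCCCGAGCTCTTACAGAACGTCGATCGCGTTCAG"

# First and last 24 nucleotides of SEQ ID NO: 1.
FDA_START = "ATGTCTAAGATTTTTGATTTCGTA"
FDA_END = "CTGAACGCGATCGACGTTCTGTAA"

# Forward oligonucleotide: drop the 6-nucleotide 5' extension, then compare.
forward_match = SEQ_ID_3[6:]
mismatches = [i for i, (a, b) in enumerate(zip(forward_match, FDA_START)) if a != b]

# Reverse oligonucleotide: its reverse complement should end the coding sequence.
reverse_match = reverse_complement(SEQ_ID_4)[:24]

print("forward mismatches at positions:", mismatches)
print("reverse oligonucleotide matches 3' end:", reverse_match == FDA_END)
```

The single forward mismatch falls in the second codon (TCT in SEQ ID NO: 1, GCT in SEQ ID NO: 3), which is also how that codon reads in the vector sequences of SEQ ID NO: 5 and SEQ ID NO: 6.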
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10847 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: Linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

1 CGATAAGCTT GATGTAATTG GAGGAAGATC AAAATTTTCA ATCCCCATTC
51 TTCGATTGCT TCAATTGAAG TTTCTCCGAT GGCGCAAGTT AGCAGAATCT
101 GCAATGGTGT GCAGAACCCA TCTCTTATCT CCAATCTCTC GAAATCCAGT
151 CAACGCAAAT CTCCCTTATC GGTTTCTCTG AAGACGCAGC AGCATCCACG
201 AGCTTATCCG ATTTCGTCGT CGTGGGGATT GAAGAAGAGT GGGATGACGT
251 TAATTGGCTC TGAGCTTCGT CCTCTTAAGG TCATGTCTTC TGTTTCCACG
301 GCGTGCATGC TTCACGGTGC AAGCAGCCGT CCAGCAACTG CTCGTAAGTC
351 CTCTGGTCTT TCTGGAACCG TCCGTATTCC AGGTGACAAG TCTATCTCCC
401 ACAGGTCCTT CATGTTTGGA GGTCTCGCTA GCGGTGAAAC TCGTATCACC
451 GGTCTTTTGG AAGGTGAAGA TGTTATCAAC ACTGGTAAGG CTATGCAAGC
501 TATGGGTGCC AGAATCCGTA AGGAAGGTGA TACTTGGATC ATTGATGGTG
551 TTGGTAACGG TGGACTCCTT GCTCCTGAGG CTCCTCTCGA TTTCGGTAAC
601 GCTGCAACTG GTTGCCGTTT GACTATGGGT CTTGTTGGTG TTTACGATTT
651 CGATAGCACT TTCATTGGTG ACGCTTCTCT CACTAAGCGT CCAATGGGTC
701 GTGTGTTGAA CCCACTTCGC GAAATGGGTG TGCAGGTGAA GTCTGAAGAC
751 GGTGATCGTC TTCCAGTTAC CTTGCGTGGA CCAAAGACTC CAACGCCAAT
801 CACCTACAGG GTACCTATGG CTTCCGCTCA AGTGAAGTCC GCTGTTCTGC
851 TTGCTGGTCT CAACACCCCA GGTATCACCA CTGTTATCGA GCCAATCATG
901 ACTCGTGACC ACACTGAAAA GATGCTTCAA GGTTTTGGTG CTAACCTTAC
951 CGTTGAGACT GATGCTGACG GTGTGCGTAC CATCCGTCTT GAAGGTCGTG
1001 GTAAGCTCAC CGGTCAAGTG ATTGATGTTC CAGGTGATCC ATCCTCTACT
1051 GCTTTCCCAT TGGTTGCTGC CTTGCTTGTT CCAGGTTCCG ACGTCACCAT
1101 CCTTAACGTT TTGATGAACC CAACCCGTAC TGGTCTCATC TTGACTCTGC
1151 AGGAAATGGG TGCCGACATC GAAGTGATCA ACCCACGTCT TGCTGGTGGA
1201 GAAGACGTGG CTGACTTGCG TGTTCGTTCT TCTACTTTGA AGGGTGTTAC
1251 TGTTCCAGAA GACCGTGCTC CTTCTATGAT CGACGAGTAT CCAATTCTCG
1301 CTGTTGCAGC TGCATTCGCT GAAGGTGCTA CCGTTATGAA CGGTTTGGAA
1351 GAACTCCGTG TTAAGGAAAG CGACCGTCTT TCTGCTGTCG CAAACGGTCT
1401 CAAGCTCAAC GGTGTTGATT GCGATGAAGG TGAGACTTCT CTCGTCGTGC
1451 GTGGTCGTCC TGACGGTAAG GGTCTCGGTA ACGCTTCTGG AGCAGCTGTC
1501 GCTACCCACC TCGATCACCG TATCGCTATG AGCTTCCTCG TTATGGGTCT
1551 CGTTTCTGAA AACCCTGTTA CTGTTGATGA TGCTACTATG ATCGCTACTA
1601 GCTTCCCAGA GTTCATGGAT TTGATGGCTG GTCTTGGAGC TAAGATCGAA
1651 CTCTCCGACA CTAAGGCTGC TTGATGAGCT CAAGAATTCG AGCTCGGTAC
1701 CGGATCCAGC TTTCGTTCGT ATCATCGGTT TCGACAACGT TCGTCAAGTT
1751 CAATGCATCA GTTTCATTGC GCACACACCA GAATCCTACT GAGTTCGAGT
1801 ATTATGGCAT TGGGAAAACT GTTTTTCTTG TACCATTTGT TGTGCTTGTA
1851 ATTTACTGTG TTTTTTATTC GGTTTTCGCT ATCGAACTGT GAAATGGAAA
1901 TGGATGGAGA AGAGTTAATG AATGATATGG TCCTTTTGTT CATTCTCAAA
1951 TTAATATTAT TTGTTTTTTC TCTTATTTGT TGTGTGTTGA ATTTGAAATT
2001 ATAAGAGATA TGCAAACATT TTGTTTTGAG TAAAAATGTG TCAAATCGTG
2051 GCCTCTAATG ACCGAAGTTA ATATGAGGAG TAAAACACTT GTAGTTGTAC
2101 CATTATGCTT ATTCACTAGG CAACAAATAT ATTTTCAGAC CTAGAAAAGC
2151 TGCAAATGTT ACTGAATACA AGTATGTCCT CTTGTGTTTT AGACATTTAT
2201 GAACTTTCCT TTATGTAATT TTCCAGAATC CTTGTCAGAT TCTAATCATT
2251 GCTTTATAAT TATAGTTATA CTCATGGATT TGTAGTTGAG TATGAAAATA
2301 TTTTTTAATG CATTTTATGA CTTGCCAATT GATTGACAAC ATGCATCAAT
2351 CGACCTGCAG CCACTCGAAG CGGCCGCGTT CAAGCTTGAG CTCAGGATTT
2401 AGCAGCATTC CAGATTGGGT TCAATCAACA AGGTACGAGC CATATCACTT
2451 TATTCAAATT GGTATCGCCA AAACCAAGAA GGAACTCCCA TCCTCAAAGG
2501 TTTGTAAGGA AGAATTCTCA GTCCAAAGCC TCAACAAGGT CAGGGTACAG
2551 AGTCTCCAAA CCATTAGCCA AAAGCTACAG GAGATCAATG AAGAATCTTC
2601 AATCAAAGTA AACTACTGTT CCAGCACATG CATCATGGTC AGTAAGTTTC 
2651 AGAAAAAGAC ATCCACCGAA GACTTAAAGT TAGTGGGCAT CTTTGAAAGT 
2701 AATCTTGTCA ACATCGAGCA GCTGGCTTGT GGGGACCAGA CAAAAAAGGA 
2751 ATGGTGCAGA ATTGTTAGGC GCACCTACCA AAAGCATCTT TGCCTTTATT
2801 GCAAAGATAA AGCAGATTCC TCTAGTACAA GTGGGGAACA AAATAACGTG 
2851 GAAAAGAGCT GTCCTGACAG CCCACTCACT AATGCGTATG ACGAACGCAG 
2901 TGACGACCAC AAAAGAATTC CCTCTATATA AGAAGGCATT CATTCCCATT 
2951 TGAAGGATCA TCAGATACTG AACCAATCCT TCTAGAAGAT CTCCACAATG 
3001 GCTTCCTCTA TGCTCTCTTC CGCTACTATG GTTGCCTCTC CGGCTCAGGC
3051 CACTATGGTC GCTCCTTTCA ACGGACTTAA GTCCTCCGCT GCCTTCCCAG 
3101 CCACCCGCAA GGCTAACAAC GACATTACTT CCATCACAAG CAACGGCGGA 
3151 AGAGTTAACT GCATGCAGGT GTGGCCTCCG ATTGGAAAGA AGAAGTTTGA 
3201 GACTCTCTCT TACCTTCCTG ACCTTACCGA TTCCGGTGGT CGCGTCAACT 
3251 GCATGCAGGC CATGGCTAAG ATTTTTGATT TCGTAAAACC TGGCGTAATC
3301 ACTGGTGATG ACGTACAGAA AGTTTTCCAG GTAGCAAAAG AAAACAACTT 
3351 CGCACTGCCA GCAGTAAACT GCGTCGGTAC TGACTCCATC AACGCCGTAC 
3401 TGGAAACCGC TGCTAAAGTT AAAGCGCCGG TTATCGTTCA GTTCTCCAAC 
3451 GGTGGTGCTT CCTTTATCGC TGGTAAAGGC GTGAAATCTG ACGTTCCGCA 
3501 GGGTGCTGCT ATCCTGGGCG CGATCTCTGG TGCGCATCAC GTTCACCAGA
3551 TGGCTGAACA TTATGGTGTT CCGGTTATCC TGCACACTGA CCACTGCGCG 
3601 AAGAAACTGC TGCCGTGGAT CGACGGTCTG TTGGACGCGG GTGAAAAACA
3651 CTTCGCAGCT ACCGGTAAGC CGCTGTTCTC TTCTCACATG ATCGACCTGT 
3701 CTGAAGAATC TCTGCAAGAG AACATCGAAA TCTGCTCTAA ATACCTGGAG 
3751 CGCATGTCCA AAATCGGCAT GACTCTGGAA ATCGAACTGG GTTGCACCGG
3801 TGGTGAAGAA GACGGCGTGG ACAACAGCCA CATGGACGCT TCTGCACTGT 
3851 ACACCCAGCC GGAAGACGTT GATTACGCAT ACACCGAACT GAGCAAAATC 
3901 AGCCCGCGTT TCACCATCGC AGCGTCCTTC GGTAACGTAC ACGGTGTTTA 
3951 CAAGCCGGGT AACGTGGTTC TGACTCCGAC CATCCTGCGT GATTCTCAGG
4001 AATATGTTTC CAAGAAACAC AACCTGCCGC ACAACAGCCT GAACTTCGTA
4051 TTCCACGGTG GTTCCGGTTC TACTGCTCAG GAAATCAAAG ACTCCGTAAG 
4101 CTACGGCGTA GTAAAAATGA ACATCGATAC CGATACCCAA TGGGCAACCT 
4151 GGGAAGGCGT TCTGAACTAC TACAAAGCGA ACGAAGCTTA TCTGCAGGGT 
4201 CAGCTGGGTA ACCCGAAAGG CGAAGATCAG CCGAACAAGA AATACTACGA 
4251 TCCGCGCGTA TGGCTGCGTG CCGGTCAGAC TTCGATGATC GCTCGTCTGG
4301 AGAAAGCATT CCAGGAACTG AACGCGATCG ACGTTCTGTA AGAGCTCGGT 
4351 ACCGGATCCA ATTCCCGATC GTTCAAACAT TTGGCAATAA AGTTTCTTAA 
4401 GATTGAATCC TGTTGCCGGT CTTGCGATGA TTATCATATA ATTTCTGTTG 
4451 AATTACGTTA AGCATGTAAT AATTAACATG TAATGCATGA CGTTATTTAT 
4501 GAGATGGGTT TTTATGATTA GAGTCCCGCA ATTATACATT TAATACGCGA
4551 TAGAAAACAA AATATAGCGC GCAAACTAGG ATAAATTATC GCGCGCGGTG 
4601 TCATCTATGT TACTAGATCG GGGATCGATC CCCGGGCGGC CGCCACTCGA 
4651 GTGGTGGCCG CATCGATCGT GAAGTTTCTC ATCTAAGCCC CCATTTGGAC 
4701 GTGAATGTAG ACACGTCGAA ATAAAGATTT CCGAATTAGA ATAATTTGTT 
4751 TATTGCTTTC GCCTATAAAT ACGACGGATC GTAATTTGTC GTTTTATCAA
4801 AATGTACTTT CATTTTATAA TAACGCTGCG GACATCTACA TTTTTGAATT 
4851 GAAAAAAAAT TGGTAATTAC TCTTTCTTTT TCTCCATATT GACCATCATA 
4901 CTCATTGCTG ATCCATGTAG ATTTCCCGGA CATGAAGCCA TTTACAATTG 
4951 AATATATCCT GCCGCCGCTG CCGCTTTGCA CCCGGTGGAG CTTGCATGTT 
5001 GGTTTCTACG CAGAACTGAG CCGGTTAGGC AGATAATTTC CATTGAGAAC
5051 TGAGCCATGT GCACCTTCCC CCCAACACGG TGAGCGACGG GGCAACGGAG 
5101 TGATCCACAT GGGACTTTTc CTAGCTTGGC TGCCATTTTT GGGGTGAGGC 
5151 CGTTCGCGCG GGGCGCCAGC TGGGGGGATG GGAGGCCCGC GTTACCGGGA 
5201 GGGTTCGAGA AGGGGGGGCA CCCCCCTTCG GCGTGCGCGG TCACGCGCCA 
5251 GGGCGCAGCC CTGGTTAAAA ACAAGGTTTA TAAATATTGG TTTAAAAGCA
5301 GGTTAAAAGA CAGGTTAGCG GTGGCCGAAA AACGGGCGGA AACCCTTGCA 
5351 AATGCTGGAT TTTCTGCCTG TGGACAGCCC CTCAAATGTC AATAGGTGCG 
5401 CCCCTCATCT GTCATCACTC TGCCCCTCAA GTGTCAAGGA TCGCGCCCCT 
5451 CATCTGTCAG TAGTCGCGCC CCTCAAGTGT CAATACCGCA GGGCACTTAT
5501 CCCCAGGCTT GTCCACATCA TCTGTGGGAA ACTCGCGTAA AATCAGGCGT
5551 TTTCGCCGAT TTGCGAGGCT GGCCAGCTCC ACGTCGCCGG CCGAAATCGA
5601 GCCTGCCCCT CATCTGTCAA CGCCGCGCCG GGTGAGTCGG CCCCTCAAGT
5651 GTCAACGTCC GCCCCTCATC TGTCAGTGAG GGCCAAGTTT TCCGCGTGGT
5701 ATCCACAACG CCGGCGGCCG GCCGCGGTGT CTCGCACACG GCTTCGACGG
5751 CGTTTCTGGC GCGTTTGCAG GGCCATAGAC GGCCGCCAGC CCAGCGGCGA
5801 GGGCAACCAG CCCGGTGAGC GTCGGAAAGG GTCGATCGAC CGATGCCCTT
5851 GAGAGCCTTC AACCCAGTCA GCTCCTTCCG GTGGGCGCGG GGCATGACTA
5901 TCGTCGCCGC ACTTATGACT GTCTTCTTTA TCATGCAACT CGTAGGACAG
5951 GTGCCGGCAG CGCTCTGGGT CATTTTCGGC GAGGACCGCT TTCGCTGGAG
6001 CGCGACGATG ATCGGCCTGT CGCTTGCGGT ATTCGGAATC TTGCACGCCC
6051 TCGCTCAAGC CTTCGTCACT GGTCCCGCCA CCAAACGTTT CGGCGAGAAG
6101 CAGGCCATTA TCGCCGGCAT GGCGGCCGAC GCGCTGGGCT ACGTCTTGCT
6151 GGCGTTCGCG ACGCGAGGCT GGATGGCCTT CCCCATTATG ATTCTTCTCG
6201 CTTCCGGCGG CATCGGGATG CCCGCGTTGC AGGCCATGCT GTCCAGGCAG
6251 GTAGATGACG ACCATCAGGG ACAGCTTCAA GGATCGCTCG CGGCTCTTAC
6301 CAGCCTAACT TCGATCACTG GACCGCTGAT CGTCACGGCG ATTTATGCCG
6351 CCTCGGCGAG CACATGGAAC GGGTTGGCAT GGATTGTAGG CGCCGCCCTA
6401 TACCTTGTCT GCCTCCCCGC GTTGCGTCGC GGTGCATGGA GCCGGGCCAC
6451 CTCGACCTGA ATGGAAGCCG GCGGCACCTC GCTAACGGAT TCACCACTCC
6501 AAGAATTGGA GCCAATCAAT TCTTGCGGAG AACTGTGAAT GCGCAAACCA
6551 ACCCTTGGCA GAACATATCC ATCGCGTCCG CCATCTCCAG CAGCCGCACG
6601 CGGCGCATCT CGGGCAGCGT TGGGTCCTGG CCACGGGTGC GCATGATCGT
6651 GCTCCTGTCG TTGAGGACCC GGCTAGGCTG GCGGGGTTGC CTTACTGGTT
6701 AGCAGAATGA ATCACCGATA CGCGAGCGAA CGTGAAGCGA CTGCTGCTGC
6751 AAAACGTCTG CGACCTGAGC AACAACATGA ATGGTCTTCG GTTTCCGTGT
6801 TTCGTAAAGT CTGGAAACGC GGAAGTCAGC GCCCTGCACC ATTATGTTCC
6851 GGATCTGCAT CGCAGGATGC TGCTGGCTAC CCTGTGGAAC ACCTACATCT
6901 GTATTAACGA AGCGCTGGCA TTGACCCTGA GTGATTTTTC TCTGGTCCCG
6951 CCGCATCCAT ACCGCCAGTT GTTTACCCTC ACAACGTTCC AGTAACCGGG
7001 CATGTTCATC ATCAGTAACC CGTATCGTGA GCATCCTCTC TCGTTTCATC
7051 GGTATCATTA CCCCCATGAA CAGAAATTCC CCCTTACACG GAGGCATCAA
7101 GTGACCAAAC AGGAAAAAAC CGCCCTTAAC ATGGCCCGCT TTATCAGAAG
7151 CCAGACATTA ACGCTTCTGG AGAAACTCAA CGAGCTGGAC GCGGATGAAC
7201 AGGCAGACAT CTGTGAATCG CTTCACGACC ACGCTGATGA GCTTTACCGC
7251 AGCTGCCTCG CGCGTTTCGG TGATGACGGT GAAAACCTCT GACACATGCA
7301 GCTCCCGGAG ACGGTCACAG CTTGTCTGTA AGCGGATGCC GGGAGCAGAC
7351 AAGCCCGTCA GGGCGCGTCA GCGGGTGTTG GCGGGTGTCG GGGCGCAGCC
7401 ATGACCCAGT CACGTAGCGA TAGCGGAGTG TATACTGGCT TAACTATGCG
7451 GCATCAGAGC AGATTGTACT GAGAGTGCAC CATATGCGGT GTGAAATACC
7501 GCACAGATGC GTAAGGAGAA AATACCGCAT CAGGCGCTCT TCCGCTTCCT
7551 CGCTCACTGA CTCGCTGCGC TCGGTCGTTC GGCTGCGGCG AGCGGTATCA
7601 GCTCACTCAA AGGCGGTAAT ACGGTTATCC ACAGAATCAG GGGATAACGC
7651 AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA AAAGGCCAGG AACCGTAAAA
7701 AGGCCGCGTT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC TGACGAGCAT
7751 CACAAAAATC GACGCTCAAG TCAGAGGTGG CGAAACCCGA CAGGACTATA
7801 AAGATACCAG GCGTTTCCCC CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC
7851 CGACCCTGCC GCTTACCGGA TACCTGTCCG CCTTTCTCCC TTCGGGAAGC
7901 GTGGCGCTTT CTCATAGCTC ACGCTGTAGG TATCTCAGTT CGGTGTAGGT
7951 CGTTCGCTCC AAGCTGGGCT GTGTGCACGA ACCCCCCGTT CAGCCCGACC
8001 GCTGCGCCTT ATCCGGTAAC TATCGTCTTG AGTCCAACCC GGTAAGACAC
8051 GACTTATCGC CACTGGCAGC AGCCACTGGT AACAGGATTA GCAGAGCGAG
8101 GTATGTAGGC GGTGCTACAG AGTTCTTGAA GTGGTGGCCT AACTACGGCT
8151 ACACTAGAAG GACAGTATTT GGTATCTGCG CTCTGCTGAA GCCAGTTACC
8201 TTCGGAAAAA GAGTTGGTAG CTCTTGATCC GGCAAACAAA CCACCGCTGG
8251 TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA GATTACGCGC AGAAAAAAAG
8301 GATCTCAAGA AGATCCTTTG ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG
8351 AACGAAAACT CACGTTAAGG GATTTTGGTC ATGAGATTAT CAAAAAGGAT
8401 CTTCACCTAG ATCCTTTTAA ATTAAAAATG AAGTTTTAAA TCAATCTAAA 
8451 GTATATATGA GTAAACTTGG TCTGACAGTT ACCAATGCTT AATCAGTGAG
8501 GCACCTATCT CAGCGATCTG TCTATTTCGT TCATCCATAG TTGCCTGACT 
8551 CCCCGTCGTG TAGATAACTA CGATACGGGA GGGCTTACCA TCTGGCCCCA
8601 GTGCTGCAAT GATACCGCGA GACCCACGCT CACCGGCTCC AGATTTATCA 
8651 GCAATAAACC AGCCAGCCGG AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC 
8701 TTTATCCGCC TCCATCCAGT CTATTAATTG TTGCCGGGAA GCTAGAGTAA 
8751 GTAGTTCGCC AGTTAATAGT TTGCGCAACG TTGTTGCCAT TGCTGCAGGT 
8801 CGGGAGCACA GGATGACGCC TAACAATTCA TTCAAGCCGA CACCGCTTCG
8851 CGGCGCGGCT TAATTCAGGA GTTAAACATC ATGAGGGAAG CGGTGATCGC 
8901 CGAAGTATCG ACTCAACTAT CAGAGGTAGT TGGCGTCATC GAGCGCCATC 
8951 TCGAACCGAC GTTGCTGGCC GTACATTTGT ACGGCTCCGC AGTGGATGGC 
9001 GGCCTGAAGC CACACAGTGA TATTGATTTG CTGGTTACGG TGACCGTAAG 
9051 GCTTGATGAA ACAACGCGGC GAGCTTTGAT CAACGACCTT TTGGAAACTT
9101 CGGCTTCCCC TGGAGAGAGC GAGATTCTCC GCGCTGTAGA AGTCACCATT 
9151 GTTGTGCACG ACGACATCAT TCCGTGGCGT TATCCAGCTA AGCGCGAACT 
9201 GCAATTTGGA GAATGGCAGC GCAATGACAT TCTTGCAGGT ATCTTCGAGC 
9251 CAGCCACGAT CGACATTGAT CTGGCTATCT TGCTGACAAA AGCAAGAGAA 
9301 CATAGCGTTG CCTTGGTAGG TCCAGCGGCG GAGGAACTCT TTGATCCGGT
9351 TCCTGAACAG GATCTATTTG AGGCGCTAAA TGAAACCTTA ACGCTATGGA 
9401 ACTCGCCGCC CGACTGGGCT GGCGATGAGC GAAATGTAGT GCTTACGTTG 
9451 TCCCGCATTT GGTACAGCGC AGTAACCGGC AAAATCGCGC CGAAGGATGT 
9501 CGCTGCCGAC TGGGCAATGG AGCGCCTGCC GGCCCAGTAT CAGCCCGTCA 
9551 TACTTGAAGC TAGGCAGGCT TATCTTGGAC AAGAAGATCG CTTGGCCTCG
9601 CGCGCAGATC AGTTGGAAGA ATTTGTTCAC TACGTGAAAG GCGAGATCAC 
9651 CAAGGTAGTC GGCAAATAAT GTCTAACAAT TCGTTCAAGC CGACGCCGCT 
9701 TCGCGGCGCG GCTTAACTCA AGCGTTAGAT GCTGCAGGCA TCGTGGTGTC 
9751 ACGCTCGTCG TTTGGTATGG CTTCATTCAG CTCCGGTTCC CAACGATCAA 
9801 GGCGAGTTAC ATGATCCCCC ATGTTGTGCA AAAAAGCGGT TAGCTCCTTC
9851 GGTCCTCCGA TCGAGGATTT TTCGGCGCTG CGCTACGTCC GCKACCGCGT 
9901 TGAGGGATCA AGCCACAGCA GCCCACTCGA CCTCTAGCCG ACCCAGACGA 
9951 GCCAAGGGAT CTTTTTGGAA TGCTGCTCCG TCGTCAGGCT TTCCGACGTT 
10001 TGGGTGGTTG AACAGAAGTC ATTATCGTAC GGAATGCCAA GCACTCCCGA 
10051 GGGGAACCCT GTGGTTGGCA TGCACATACA AATGGACGAA CGGATAAACC
10101 TTTTCACGCC CTTTTAAATA TCCGTTATTC TAATAAACGC TCTTTTCTCT 
10151 TAGGTTTACC CGCCAATATA TCCTGTCAAA CACTGATAGT TTAAACTGAA 
10201 GGCGGGAAAC GACAATCTGA TCCCCATCAA GCTTGAGCTC AGGATTTAGC 
10251 AGCATTCCAG ATTGGGTTCA ATCAACAAGG TACGAGCCAT ATCACTTTAT 
10301 TCAAATTGGT ATCGCCAAAA CCAAGAAGGA ACTCCCATCC TCAAAGGTTT
10351 GTAAGGAAGA ATTCTCAGTC CAAAGCCTCA ACAAGGTCAG GGTACAGAGT 
10401 CTCCAAACCA TTAGCCAAAA GCTACAGGAG ATCAATGAAG AATCTTCAAT 
10451 CAAAGTAAAC TACTGTTCCA GCACATGCAT CATGGTCAGT AAGTTTCAGA
10501 AAAAGACATC CACCGAAGAC TTAAAGTTAG TGGGCATCTT TGAAAGTAAT 
10551 CTTGTCAACA TCGAGCAGCT GGCTTGTGGG GACCAGACAA AAAAGGAATG
10601 GTGCAGAATT GTTAGGCGCA CCTACCAAAA GCATCTTTGC CTTTATTGCA 
10651 AAGATAAAGC AGATTCCTCT AGTACAAGTG GGGAACAAAA TAACGTGGAA 
10701 AAGAGCTGTC CTGACAGCCC ACTCACTAAT GCGTATGACG AACGCAGTGA
10751 CGACCACAAA AGAATTCCCT CTATATAAGA AGGCATTCAT TCCCATTTGA
10801 AGGATCATCA GATACTGAAC CAATCCTTCT AGAAGATCTA AGCTTAT

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 10901 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: Linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

1 CGATAAGCTT GATGTAATTG GAGGAAGATC AAAATTTTCA ATCCCCATTC
51 TTCGATTGCT TCAATTGAAG TTTCTCCGAT GGCGCAAGTT AGCAGAATCT
101 GCAATGGTGT GCAGAACCCA TCTCTTATCT CCAATCTCTC GAAATCCAGT
151 CAACGCAAAT CTCCCTTATC GGTTTCTCTG AAGACGCAGC AGCATCCACG
201 AGCTTATCCG ATTTCGTCGT CGTGGGGATT GAAGAAGAGT GGGATGACGT
251 TAATTGGCTC TGAGCTTCGT CCTCTTAAGG TCATGTCTTC TGTTTCCACG
301 GCGTGCATGC TTCACGGTGC AAGCAGCCGT CCAGCAACTG CTCGTAAGTC
351 CTCTGGTCTT TCTGGAACCG TCCGTATTCC AGGTGACAAG TCTATCTCCC
401 ACAGGTCCTT CATGTTTGGA GGTCTCGCTA GCGGTGAAAC TCGTATCACC
451 GGTCTTTTGG AAGGTGAAGA TGTTATCAAC ACTGGTAAGG CTATGCAAGC
501 TATGGGTGCC AGAATCCGTA AGGAAGGTGA TACTTGGATC ATTGATGGTG
551 TTGGTAACGG TGGACTCCTT GCTCCTGAGG CTCCTCTCGA TTTCGGTAAC
601 GCTGCAACTG GTTGCCGTTT GACTATGGGT CTTGTTGGTG TTTACGATTT
651 CGATAGCACT TTCATTGGTG ACGCTTCTCT CACTAAGCGT CCAATGGGTC
701 GTGTGTTGAA CCCACTTCGC GAAATGGGTG TGCAGGTGAA GTCTGAAGAC
751 GGTGATCGTC TTCCAGTTAC CTTGCGTGGA CCAAAGACTC CAACGCCAAT
801 CACCTACAGG GTACCTATGG CTTCCGCTCA AGTGAAGTCC GCTGTTCTGC
851 TTGCTGGTCT CAACACCCCA GGTATCACCA CTGTTATCGA GCCAATCATG
901 ACTCGTGACC ACACTGAAAA GATGCTTCAA GGTTTTGGTG CTAACCTTAC
951 CGTTGAGACT GATGCTGACG GTGTGCGTAC CATCCGTCTT GAAGGTCGTG
1001 GTAAGCTCAC CGGTCAAGTG ATTGATGTTC CAGGTGATCC ATCCTCTACT
1051 GCTTTCCCAT TGGTTGCTGC CTTGCTTGTT CCAGGTTCCG ACGTCACCAT
1101 CCTTAACGTT TTGATGAACC CAACCCGTAC TGGTCTCATC TTGACTCTGC
1151 AGGAAATGGG TGCCGACATC GAAGTGATCA ACCCACGTCT TGCTGGTGGA
1201 GAAGACGTGG CTGACTTGCG TGTTCGTTCT TCTACTTTGA AGGGTGTTAC
1251 TGTTCCAGAA GACCGTGCTC CTTCTATGAT CGACGAGTAT CCAATTCTCG
1301 CTGTTGCAGC TGCATTCGCT GAAGGTGCTA CCGTTATGAA CGGTTTGGAA
1351 GAACTCCGTG TTAAGGAAAG CGACCGTCTT TCTGCTGTCG CAAACGGTCT
1401 CAAGCTCAAC GGTGTTGATT GCGATGAAGG TGAGACTTCT CTCGTCGTGC
1451 GTGGTCGTCC TGACGGTAAG GGTCTCGGTA ACGCTTCTGG AGCAGCTGTC
1501 GCTACCCACC TCGATCACCG TATCGCTATG AGCTTCCTCG TTATGGGTCT
1551 CGTTTCTGAA AACCCTGTTA CTGTTGATGA TGCTACTATG ATCGCTACTA
1601 GCTTCCCAGA GTTCATGGAT TTGATGGCTG GTCTTGGAGC TAAGATCGAA
1651 CTCTCCGACA CTAAGGCTGC TTGATGAGCT CAAGAATTCG AGCTCGGTAC
1701 CGGATCCAGC TTTCGTTCGT ATCATCGGTT TCGACAACGT TCGTCAAGTT
1751 CAATGCATCA GTTTCATTGC GCACACACCA GAATCCTACT GAGTTCGAGT
1801 ATTATGGCAT TGGGAAAACT GTTTTTCTTG TACCATTTGT TGTGCTTGTA
1851 ATTTACTGTG TTTTTTATTC GGTTTTCGCT ATCGAACTGT GAAATGGAAA
1901 TGGATGGAGA AGAGTTAATG AATGATATGG TCCTTTTGTT CATTCTCAAA
1951 TTAATATTAT TTGTTTTTTC TCTTATTTGT TGTGTGTTGA ATTTGAAATT
2001 ATAAGAGATA TGCAAACATT TTGTTTTGAG TAAAAATGTG TCAAATCGTG
2051 GCCTCTAATG ACCGAAGTTA ATATGAGGAG TAAAACACTT GTAGTTGTAC
2101 CATTATGCTT ATTCACTAGG CAACAAATAT ATTTTCAGAC CTAGAAAAGC
2151 TGCAAATGTT ACTGAATACA AGTATGTCCT CTTGTGTTTT AGACATTTAT
2201 GAACTTTCCT TTATGTAATT TTCCAGAATC CTTGTCAGAT TCTAATCATT
2251 GCTTTATAAT TATAGTTATA CTCATGGATT TGTAGTTGAG TATGAAAATA
2301 TTTTTTAATG CATTTTATGA CTTGCCAATT GATTGACAAC ATGCATCAAT
2351 CGACCTGCAG CCACTCGAAG CGGCCGCGTT CAAGCTTGAG CTCAGGATTT
2401 AGCAGCATTC CAGATTGGGT TCAATCAACA AGGTACGAGC CATATCACTT
2451 TATTCAAATT GGTATCGCCA AAACCAAGAA GGAACTCCCA TCCTCAAAGG
2501 TTTGTAAGGA AGAATTCTCA GTCCAAAGCC TCAACAAGGT CAGGGTACAG
2551 AGTCTCCAAA CCATTAGCCA AAAGCTACAG GAGATCAATG AAGAATCTTC
2601 AATCAAAGTA AACTACTGTT CCAGCACATG CATCATGGTC AGTAAGTTTC
2651 AGAAAAAGAC ATCCACCGAA GACTTAAAGT TAGTGGGCAT CTTTGAAAGT
2701 AATCTTGTCA ACATCGAGCA GCTGGCTTGT GGGGACCAGA CAAAAAAGGA
2751 ATGGTGCAGA ATTGTTAGGC GCACCTACCA AAAGCATCTT TGCCTTTATT
2801 GCAAAGATAA AGCAGATTCC TCTAGTACAA GTGGGGAACA AAATAACGTG 
2851 GAAAAGAGCT GTCCTGACAG CCCACTCACT AATGCGTATG ACGAACGCAG 
2901 TGACGACCAC AAAAGAATTC CCTCTATATA AGAAGGCATT CATTCCCATT 
2951 TGAAGGATCA TCAGATACTG AACCAATCCT TCTAGAAGAT CTAAGCTTAT
3001 CGATAAGCTT GATGTAATTG GAGGAAGATC AAAATTTTCA ATCCCCATTC 
3051 TTCGATTGCT TCAATTGAAG TTTCTCCGAT GGCGCAAGTT AGCAGAATCT 
3101 GCAATGGTGT GCAGAACCCA TCTCTTATCT CCAATCTCTC GAAATCCAGT 
3151 CAACGCAAAT CTCCCTTATC GGTTTCTCTG AAGACGCAGC AGCATCCACG 
3201 AGCTTATCCG ATTTCGTCGT CGTGGGGATT GAAGAAGAGT GGGATGACGT
3251 TAATTGGCTC TGAGCTTCGT CCTCTTAAGG TCATGTCTTC TGTTTCCACG 
3301 GCGTGCATGC AGGCcatggC TAAGATTTTT GATTTCGTAA AACCTGGCGT 
3351 AATCACTGGT GATGACGTAC AGAAAGTTTT CCAGGTAGCA AAAGAAAACA 
3401 ACTTCGCACT GCCAGCAGTA AACTGCGTCG GTACTGACTC CATCAACGCC 
3451 GTACTGGAAA CCGCTGCTAA AGTTAAAGCG CCGGTTATCG TTCAGTTCTC
3501 CAACGGTGGT GCTTCCTTTA TCGCTGGTAA AGGCGTGAAA TCTGACGTTC 
3551 CGCAGGGTGC TGCTATCCTG GGCGCGATCT CTGGTGCGCA TCACGTTCAC 
3601 CAGATGGCTG AACATTATGG TGTTCCGGTT ATCCTGCACA CTGACCACTG 
3651 CGCGAAGAAA CTGCTGCCGT GGATCGACGG TCTGTTGGAC GCGGGTGAAA 
3701 AACACTTCGC AGCTACCGGT AAGCCGCTGT TCTCTTCTCA CATGATCGAC
3751 CTGTCTGAAG AATCTCTGCA AGAGAACATC GAAATCTGCT CTAAATACCT 
3801 GGAGCGCATG TCCAAAATCG GCATGACTCT GGAAATCGAA CTGGGTTGCA 
3851 CCGGTGGTGA AGAAGACGGC GTGGACAACA GCCACATGGA CGCTTCTGCA 
3901 CTGTACACCC AGCCGGAAGA CGTTGATTAC GCATACACCG AACTGAGCAA
3951 AATCAGCCCG CGTTTCACCA TCGCAGCGTC CTTCGGTAAC GTACACGGTG
4001 TTTACAAGCC GGGTAACGTG GTTCTGACTC CGACCATCCT GCGTGATTCT 
4051 CAGGAATATG TTTCCAAGAA ACACAACCTG CCGCACAACA GCCTGAACTT 
4101 CGTATTCCAC GGTGGTTCCG GTTCTACTGC TCAGGAAATC AAAGACTCCG 
4151 TAAGCTACGG CGTAGTAAAA ATGAACATCG ATACCGATAC CCAATGGGCA 
4201 ACCTGGGAAG GCGTTCTGAA CTACTACAAA GCGAACGAAG CTTATCTGCA
4251 GGGTCAGCTG GGTAACCCGA AAGGCGAAGA TCAGCCGAAC AAGAAATACT 
4301 ACGATCCGCG CGTATGGCTG CGTGCCGGTC AGACTTCGAT GATCGCTCGT 
4351 CTGGAGAAAG CATTCCAGGA ACTGAACGCG ATCGACGTTC TGTAAGAGCT 
4401 CGGTACCGGA TCCAATTccc GATCGTTCAA ACATTTGGCA ATAAAGTTTC 
4451 TTAAGATTGA ATCCTGTTGC CGGTCTTGCG ATGATTATCA TATAATTTCT
4501 GTTGAATTAC GTTAAGCATG TAATAATTAA CATGTAATGC ATGACGTTAT 
4551 TTATGAGATG GGTTTTTATG ATTAGAGTCC CGCAATTATA CATTTAATAC 
4601 GCGATAGAAA ACAAAATATA GCGCGCAAAC TAGGATAAAT TATCGCGCGC 
4651 GGTGTCATCT ATGTTACTAG ATCGGGGATC GATCCCCGGG CGGCCGCCAC 
4701 TCGAGTGGTG GCCGCATCGA TCGTGAAGTT TCTCATCTAA GCCCCCATTT
4751 GGACGTGAAT GTAGACACGT CGAAATAAAG ATTTCCGAAT TAGAATAATT 
4801 TGTTTATTGC TTTCGCCTAT AAATACGACG GATCGTAATT TGTCGTTTTA 
4851 TCAAAATGTA CTTTCATTTT ATAATAACGC TGCGGACATC TACATTTTTG 
4901 AATTGAAAAA AAATTGGTAA TTACTCTTTC TTTTTCTCCA TATTGACCAT 
4951 CATACTCATT GCTGATCCAT GTAGATTTCC CGGACATGAA GCCATTTACA
5001 ATTGAATATA TCCTGCCGCC GCTGCCGCTT TGCACCCGGT GGAGCTTGCA
5051 TGTTGGTTTC TACGCAGAAC TGAGCCGGTT AGGCAGATAA TTTCCATTGA 
5101 GAACTGAGCC ATGTGCACCT TCCCCCCAAC ACGGTGAGCG ACGGGGCAAC 
5151 GGAGTGATCC ACATGGGACT TTTCCTAGCT TGGCTGCCAT TTTTGGGGTG 
5201 AGGCCGTTCG CGCGGGGCGC CAGCTGGGGG GATGGGAGGC CCGCGTTACC
5251 GGGAGGGTTC GAGAAGGGGG GGCACCCCCC TTCGGCGTGC GCGGTCACGC 
5301 GCCAGGGCGC AGCCCTGGTT AAAAACAAGG TTTATAAATA TTGGTTTAAA 
5351 AGCAGGTTAA AAGACAGGTT AGCGGTGGCC GAAAAACGGG CGGAAACCCT 
5401 TGCAAATGCT GGATTTTCTG CCTGTGGACA GCCCCTCAAA TGTCAATAGG 
5451 TGCGCCCCTC ATCTGTCATC ACTCTGCCCC TCAAGTGTCA AGGATCGCGC
5501 CCCTCATCTG TCAGTAGTCG CGCCCCTCAA GTGTCAATAC CGCAGGGCAC 
5551 TTATCCCCAG GCTTGTCCAC ATCATCTGTG GGAAACTCGC GTAAAATCAG 
5601 GCGTTTTCGC CGATTTGCGA GGCTGGCCAG CTCCACGTCG CCGGCCGAAA 
5651
5701 
5751 
5801 
5851 
5901 
5951 
6001 
6051 
6101 
6151 
6201 
6251 
6301 
6351 
6401 
6451 
6501 
6551 
6601 
6651 
6701 
6751 
6801 
6851 
6901 
6951 
7001 
7051 
7101 
7151 
7201 
7251 
7301 
7351 
7401 
7451 
7501 
7551 
7601 
7651 
7701 
7751 
7801 
7851 
7901 
7951 
8001 
8051
8101 
8151 
8201 
8251 
8301 
8351 
8401 
8451 
8501 



TCGAGCCTGC 
AAGTGTCAAC 
TGGTATCCAC 
ACGGCGTTTC 
GCGAGGGCAA 
CCTTGAGAGC 
ACTATCGTCG 
ACAGGTGCCG 
GGAGCGCGAC 
GCCCTCGCTC 
GAAGCAGGCC 
TGCTGGCGTT 
CTCGCTTCCG 
GCAGGTAGAT 
TTACCAGCCT 
GCCGCCTCGG 
CCTATACCTT 
CCACCTCGAC 
CTCCAAGAAT 
ACCAACCCTT 
CACGCGGCGC 
TCGTGCTCCT 
GGTTAGCAGA 
CTGCAAAACG 
GTGTTTCGTA 
TTCCGGATCT 
ATCTGTATTA 
CCCGCCGCAT 
CGGGCATGTT 
CATCGGTATC 
TCAAGTGACC 
GAAGCCAGAC 
GAACAGGCAG 
CCGCAGCTGC 
TGCAGCTCCC 
AGACAAGCCC 
AGCCATGACC 
TGCGGCATCA 
TACCGCACAG 
TCCTCGCTCA 
ATCAGCTCAC 
ACGCAGGAAA 
AAAAAGGCCG 
GCATCACAAA 
TATAAAGATA 
GTTCCGACCC 
AAGCGTGGCG 
AGGTCGTTCG 
GACCGCTGCG 
ACACGACTTA 
CGAGGTATGT 
GGCTACACTA 
TACCTTCGGA 
CTGGTAGCGG 
AAAGGATCTC 
GTGGAACGAA 
GGATCTTCAC 
TAAAGTATAT 



CCCTCATCTG 
GTCCGCCCCT 
AACGCCGGCG 
TGGCGCGTTT 
CCAGCCCGGT 
CTTCAACCCA 
CCGCACTTAT 
GCAGCGCTCT 
GATGATCGGC 
AAGCCTTCGT 
ATTATCGCCG 
CGCGACGCGA 
GCGGCATCGG 
GACGACCATC 
AACTTCGATC 
CGAGCACATG 
GTCTGCCTCC 
CTGAATGGAA 
TGGAGCCAAT 
GGCAGAACAT 
ATCTCGGGCA 
GTCGTTGAGG 
ATGAATCACC 
TCTGCGACCT 
AAGTCTGGAA 
GCATCGCAGG 
ACGAAGCGCT 
CCATACCGCC 
CATCATCAGT 
ATTACCCCCA 
AAACAGGAAA 
ATTAACGCTT 
ACATCTGTGA 
CTCGCGCGTT 
GGAGACGGTC 
GTCAGGGCGC 
CAGTCACGTA 
GAGCAGATTG 
ATGCGTAAGG 
CTGACTCGCT 
TCAAAGGCGG 
GAACATGTGA 
CGTTGCTGGC 
AATCGACGCT 
CCAGGCGTTT 
TGCCGCTTAC 
CTTTCTCATA 
CTCCAAGCTG 
CCTTATCCGG 
TCGCCACTGG 
AGGCGGTGCT 
GAAGGACAGT 
AAAAGAGTTG 
TGGTTTTTTT 
AAGAAGATCC 
AACTCACGTT 
CTAGATCCTT 
ATGAGTAAAC 



TCAACGCCGC 
CATCTGTCAG 
GCCGGCCGCG 
GCAGGGCCAT 
GAGCGTCGGA 
GTCAGCTCCT 
GACTGTCTTC 
GGGTCATTTT 
CTGTCGCTTG 
CACTGGTCCC 
GCATGGCGGC 
GGCTGGATGG 
GATGCCCGCG 
AGGGACAGCT 
ACTGGACCGC 
GAACGGGTTG 
CCGCGTTGCG 
GCCGGCGGCA 
CAATTCTTGC 
ATCCATCGCG 
GCGTTGGGTC 
ACCCGGCTAG 
GATACGCGAG 
GAGCAACAAC 
ACGCGGAAGT 
ATGCTGCTGG 
GGCATTGACC 
AGTTGTTTAC 
AACCCGTATC 
TGAACAGAAA 
AAACCGCCCT 
CTGGAGAAAC 
ATCGCTTCAC 
TCGGTGATGA 
ACAGCTTGTC 
GTCAGCGGGT 
GCGATAGCGG 
TACTGAGAGT 
AGAAAATACC 
GCGCTCGGTC 
TAATACGGTT 
GCAAAAGGCC 
GTTTTTCCAT 
CAAGTCAGAG 
CCCCCTGGAA 
CGGATACCTG 
GCTCACGCTG 
GGCTGTGTGC 
TAACTA7CGT 
CAGCAGCCAC 
ACAGAGTTCT 
ATTTGGTATC 
GTAGCTCTTG 
GTTTGCAAGC 
TTTGATCTTT 
AAGGGATTTT 
TTAAATTAAA 
TTGGTCTGAC 



GCCGGGTGAG 
TGAGGGCCAA 
GTGTCTCGCA 
AGACGGCCGC 
AAGGGTCGAT 
TCCGGTGGGC 
TTTATCATGC 
CGGCGAGGAC 
CGGTATTCGG 
GCCACCAAAC 
CGACGCGCTG 
CCTTCCCCAT 
TTGCAGGCCA 
TCAAGGATCG 
TGATCGTCAC 
GCATGGATTG 
TCGCGGTGCA 
CCTCGCTAAC 
GGAGAACTGT 
TCCGCCATCT 
CTGGCCACGG 
GCTGGCGGGG 
CGAACGTGAA 
ATGAATGGTC 
CAGCGCCCTG 
CTACCCTGTG 
CTGAGTGATT 
CCTCACAACG 
GTGAGCATCC 
TTCCCCCTTA 
TAACATGGCC 
TCAACGAGCT 
GACCACGCTG 
CGGTGAAAAC 
TGTAAGCGGA 
GTTGGCGGGT 
AGTGTATACT 
GCACCATATG 
GCATCAGGCG 
GTTCGGCTGC 
ATCCACAGAA 
AGCAAAAGGC 
AGGCTCCGCC 
GTGGCGAAAC 
GCTCCCTCGT 
TCCGCCTTTO 
TAGGTATCTC 
ACGAACCCCC 
C7TGAGTCCA 
TGGTAACAGG 
TGAAGTGGTG 
TGCGCTCTGC 
ATCCGGCAAA 
AGCAGATTAC 
TCTACGGGGT 
GGTCATGAGA 
AATGAAGTTT 
AGTTACCAAT 



TCGGCCCCTC 

GTTTTCCGCG 

CACGGCTTCG . 

CAGCCCAGCG 

CGACCGATGC 

GCGGGGCATG 

AACTCGTAGG 

CGCTTTCGCT 

AATCTTGCAC 

GTTTCGGCGA 

GGCTACGTCT 

TATGATTCTT 

TGCTGTCCAG 

CTCGCGGCTC 

GGCGATTTAT 

TAGGCGCCGC 

TGGAGCCGGG 

GGATTCACCA 

GAATGCGCAA 

CCAGCAGCCG 

GTGCGCATGA 

TTGCCTTACT 

GCGACTGCTG 

TTCGGTTTCC 

CACCATTATG 

GAACACCTAC 

TTTCTCTGGT 

TTCCAGTAAC 

TCTCTCGTTT 

CACGGAGGCA 

CGCTTTATCA 

GGACGCGGAT 

ATGAGCTTTA 

CTCTGACACA 

TGCCGGGAGC 

GTCGGGGCGC 

GGCTTAACTA 

CGGTGTGAAA 

CTCTTCCGCT 

GGCGAGCGGT 

TCAGGGGATA 

CAGGAACCGT 

CCCCTGACGA 

CCGACAGGAC 

GCGCTCTCCT 

TCCCTTCGGG 

AGTTCGGTGT 

CGTTCAGCCC 

ACCCGGTAAG 

ATTAGCAGAG 

GCCTAACTAC 

TGAAGCCAGT 

CAAACCACCG 

GCGCAGAAAA 

CTGACGCT.CA 

TTATCAAAAA 

TAAATCAATC 

GCTTAATCAG 
8551 TGAGGCACCT ATCTCAGCGA TCTGTCTATT TCGTTCATCC ATAGTTGCCT
8601 GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC
8651 CCCAGTGCTG CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT
8701 ATCAGCAATA AACCAGCCAG CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG
8751 CAACTTTATC CGCCTCCATC CAGTCTATTA ATTGTTGCCG GGAAGCTAGA
8801 GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC AACGTTGTTG CCATTGCTGC
8851 AGGTCGGGAG CACAGGATGA CGCCTAACAA TTCATTCAAG CCGACACCGC
8901 TTCGCGGCGC GGCTTAATTC AGGAGTTAAA CATCATGAGG GAAGCGGTGA
8951 TCGCCGAAGT ATCGACTCAA CTATCAGAGG TAGTTGGCGT CATCGAGCGC
9001 CATCTCGAAC CGACGTTGCT GGCCGTACAT TTGTACGGCT CCGCAGTGGA
9051 TGGCGGCCTG AAGCCACACA GTGATATTGA TTTGCTGGTT ACGGTGACCG
9101 TAAGGCTTGA TGAAACAACG CGGCGAGCTT TGATCAACGA CCTTTTGGAA
9151 ACTTCGGCTT CCCCTGGAGA GAGCGAGATT CTCCGCGCTG TAGAAGTCAC
9201 CATTGTTGTG CACGACGACA TCATTCCGTG GCGTTATCCA GCTAAGCGCG
9251 AACTGCAATT TGGAGAATGG CAGCGCAATG ACATTCTTGC AGGTATCTTC
9301 GAGCCAGCCA CGATCGACAT TGATCTGGCT ATCTTGCTGA CAAAAGCAAG
9351 AGAACATAGC GTTGCCTTGG TAGGTCCAGC GGCGGAGGAA CTCTTTGATC
9401 CGGTTCCTGA ACAGGATCTA TTTGAGGCGC TAAATGAAAC CTTAACGCTA
9451 TGGAACTCGC CGCCCGACTG GGCTGGCGAT GAGCGAAATG TAGTGCTTAC
9501 GTTGTCCCGC ATTTGGTACA GCGCAGTAAC CGGCAAAATC GCGCCGAAGG
9551 ATGTCGCTGC CGACTGGGCA ATGGAGCGCC TGCCGGCCCA GTATCAGCCC
9601 GTCATACTTG AAGCTAGGCA GGCTTATCTT GGACAAGAAG ATCGCTTGGC
9651 CTCGCGCGCA GATCAGTTGG AAGAATTTGT TCACTACGTG AAAGGCGAGA
9701 TCACCAAGGT AGTCGGCAAA TAATGTCTAA CAATTCGTTC AAGCCGACGC
9751 CGCTTCGCGG CGCGGCTTAA CTCAAGCGTT AGATGCTGCA GGCATCGTGG
9801 TGTCACGCTC GTCGTTTGGT ATGGCTTCAT TCAGCTCCGG TTCCCAACGA
9851 TCAAGGCGAG TTACATGATC CCCCATGTTG TGCAAAAAAG CGGTTAGCTC
9901 CTTCGGTCCT CCGATCGAGG ATTTTTCGGC GCTGCGCTAC GTCCGCKACC
9951 GCGTTGAGGG ATCAAGCCAC AGCAGCCCAC TCGACCTCTA GCCGACCCAG
10001 ACGAGCCAAG GGATCTTTTT GGAATGCTGC TCCGTCGTCA GGCTTTCCGA
10051 CGTTTGGGTG GTTGAACAGA AGTCATTATC GTACGGAATG CCAAGCACTC
10101 CCGAGGGGAA CCCTGTGGTT GGCATGCACA TACAAATGGA CGAACGGATA
10151 AACCTTTTCA CGCCCTTTTA AATATCCGTT ATTCTAATAA ACGCTCTTTT
10201 CTCTTAGGTT TACCCGCCAA TATATCCTGT CAAACACTGA TAGTTTAAAC
10251 TGAAGGCGGG AAACGACAAT CTGATCCCCA TCAAGCTTGA GCTCAGGATT
10301 TAGCAGCATT CCAGATTGGG TTCAATCAAC AAGGTACGAG CCATATCACT
10351 TTATTCAAAT TGGTATCGCC AAAACCAAGA AGGAACTCCC ATCCTCAAAG
10401 GTTTGTAAGG AAGAATTCTC AGTCCAAAGC CTCAACAAGG TCAGGGTACA
10451 GAGTCTCCAA ACCATTAGCC AAAAGCTACA GGAGATCAAT GAAGAATCTT
10501 CAATCAAAGT AAACTACTGT TCCAGCACAT GCATCATGGT CAGTAAGTTT
10551 CAGAAAAAGA CATCCACCGA AGACTTAAAG TTAGTGGGCA TCTTTGAAAG
10601 TAATCTTGTC AACATCGAGC AGCTGGCTTG TGGGGACCAG ACAAAAAAGG
10651 AATGGTGCAG AATTGTTAGG CGCACCTACC AAAAGCATCT TTGCCTTTAT
10701 TGCAAAGATA AAGCAGATTC CTCTAGTACA AGTGGGGAAC AAAATAACGT
10751 GGAAAAGAGC TGTCCTGACA GCCCACTCAC TAATGCGTAT GACGAACGCA
10801 GTGACGACCA CAAAAGAATT CCCTCTATAT AAGAAGGCAT TCATTCCCAT
10851 TTGAAGGATC ATCAGATACT GAACCAATCC TTCTAGAAGA TCTAAGCTTA
10901 T
CLAIMS

1. A recombinant, double-stranded DNA molecule containing

a) a promoter functional in plant cells, and

b) a DNA sequence coding for a polypeptide having the enzymatic activity of
a fructose-1,6-bisphosphate aldolase and operatively linked to the promoter
in sense orientation.

2. The DNA molecule according to claim 1, wherein the DNA sequence
coding for a polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate
aldolase is derived from a prokaryotic organism.

3. The DNA molecule according to claim 2, wherein the prokaryotic organism is
Escherichia coli.

4. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 60% identity with a prokaryotic DNA sequence coding for
fructose-1,6-bisphosphate aldolase class II.

5. The DNA molecule according to claim 1, wherein the DNA sequence coding for
the polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate
aldolase is a sequence capable of hybridizing with the coding region depicted as
SEQ ID NO. 1.

6. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 60% identity with the coding region depicted as SEQ ID NO. 1.

7. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 70% identity with the coding region depicted as SEQ ID NO. 1.

8. The DNA molecule according to claim 1, wherein the DNA sequence coding for a
polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate aldolase
has at least about 80% identity with the coding region depicted as SEQ ID NO. 1.

9. The DNA molecule according to claim 1, wherein the DNA sequence coding for
the polypeptide having the enzymatic activity of a fructose-1,6-bisphosphate
aldolase has the coding region depicted as SEQ ID NO. 1, or encodes the same
peptide as SEQ ID NO. 1 in accordance with the degeneracy of the genetic code.

10. A transgenic plant cell containing in its genome a recombinant DNA molecule
according to any of claims 1-9.

11. A transgenic plant containing plant cells according to claim 10.

12. The transgenic plant of claim 11, wherein the plant exhibits a property selected
from the group consisting of increased photosynthesis rates, increased yields,
increased growth rates and improved solids uniformity compared with plants that
do not contain the recombinant DNA molecule.

13. The transgenic plant according to claim 11, which is a crop plant.

14. The transgenic plant according to claim 11, selected from the group consisting of
corn, wheat, rice, tomato, potato, carrots, sweet potato, yams, artichoke, alfalfa,
peanut, barley, cotton, soybean, canola, sunflower, sugarbeet, apple, pear, orange,
peach, sugarcane, strawberry, raspberry, banana, grape, plantain, tobacco, lettuce,
cassava, cruciferous vegetables, forestry species and horticultural species.

15. The transgenic plant of claim 11, wherein the plant is a potato.

16. A food product derived from the potato of claim 15.

17. The food product of claim 16, which is a french fry or a potato chip.

18. Propagation material derived from the transgenic plant of claim 11.

19. A process for increasing the photosynthesis rate in plants which comprises 
transforming plant cells with a DNA molecule according to any one of claims 1 to 
9, and regenerating the transformed cells to produce a transgenic plant. 

20. A process for increasing the yield in plants which comprises transforming plant 
cells with a DNA molecule according to any one of claims 1 to 9, and regenerating 
the transformed cells to produce a transgenic plant. 

21. A process for increasing the growth rate in plants which comprises transforming
plant cells with a DNA molecule according to any one of claims 1 to 9, and
regenerating the transformed cells to produce a transgenic plant.

22. A process for improving the solids uniformity in plants which comprises 
transforming plant cells with a DNA molecule according to any one of claims 1 to 
9, and regenerating the transformed cells to produce a transgenic plant. 

23. In a method for the processing of potatoes into fries or chips, the improvement 
comprising, utilizing a potato that overexpresses the fda transgene providing a 
higher solids uniformity in such potato. 






PCT/US98/12447

WO 98/58069 

1/11 



TGCAACTTGAAGTATGACGAGTATAAC3GCCCGACGATACAGGACAAGAGACATGTCTAAG
                                                    MetSerLys



ATTTTTGATTTCGTAAAACCTGGCGTAATCACTGGTGATGACGTACAGAAAGTTTTCCAG 
IlePheAspPheValLysProGlyValIleThrGlyAspAspValGlnLysValPheGln



GTAGCAAAAGAAAACAACTTCGCACTGCCAGCAGTAAACTGCGTCGGTACTGACTCCATC 
ValAlaLysGluAsnAsnPheAlaLeuProAlaValAsnCysValGlyThrAspSerIle



AACGCCGTACTGGAAACCGCTGCTAAAGTTAAAGCGCCGGTTATCGTTCAGTTCTCCAAC 
AsnAlaValLeuGluThrAlaAlaLysValLysAlaProValIleValGlnPheSerAsn



GGTGGTGCTTCCTTTATCGCTGGTAAAGGCGTGAAATCTGACGTTCCGCAGGGTGCTGCT 
GlyGlyAlaSerPheIleAlaGlyLysGlyValLysSerAspValProGlnGlyAlaAla



ATCCTGGGCGCGATCTCTGGTGCGCATCACGTTCACCAGATGGCTGAACATTATGGTGTT 
IleLeuGlyAlaIleSerGlyAlaHisHisValHisGlnMetAlaGluHisTyrGlyVal



CCGGTTATCCTGCACACTGACCACTGCGCGAAGAAACTGCTGCCGTGGATCGACGGTCTG 
ProValIleLeuHisThrAspHisCysAlaLysLysLeuLeuProTrpIleAspGlyLeu



TTGGACGCGGGTGAAAAACACTTCGCAGCTACCGGTAAGCCGCTGTTCTCTTCTCACATG 
LeuAspAlaGlyGluLysHisPheAlaAlaThrGlyLysProLeuPheSerSerHisMet 



ATCGACCTGTCTGAAGAATCTCTGCAAGAGAACATCGAAATCTGCTCTAAATACCTGGAG 
IleAspLeuSerGluGluSerLeuGlnGluAsnIleGluIleCysSerLysTyrLeuGlu



CGCATGTCCAAAATCGGCATGACTCTGGAAATCGAACTGGGTTGCACCGGTGGTGAAGAA 
ArgMetSerLysIleGlyMetThrLeuGluIleGluLeuGlyCysThrGlyGlyGluGlu 



GACGGCGTGGACAACAGCCACATGGACGCTTCTGCACTGTACACCCAGCCGGAAGACGTT 
AspGlyValAspAsnSerHisMetAspAlaSerAlaLeuTyrThrGlnProGluAspVal



GATTACGCATACACCGAACTGAGCAAAATCAGCCCGCGTTTCACCATCGCAGCGTCCTTC 
AspTyrAlaTyrThrGluLeuSerLysIleSerProArgPheThrIleAlaAlaSerPhe

FIG. 1A 
GGTAACGTACACGGTGTTTACAAGCCGGGTAACGTGGTTCTGACTCCGACCATCCTGCGT
GlyAsnValHisGlyValTyrLysProGlyAsnValValLeuThrProThrIleLeuArg

GATTCTCAGGAATATGTTTCCAAGAAACACAACCTGCCGCACAACAGCCTGAACTTCGTA 
AspSerGlnGluTyrValSerLysLysHisAsnLeuProHisAsnSerLeuAsnPheVal 

TTCCACGGTGGTTCCGGTTCTACTGCTCAGGAAATCAAAGACTCCGTAAGCTACGGCGTA 
PheHisGlyGlySerGlySerThrAlaGlnGluIleLysAspSerValSerTyrGlyVal



GTAAAAATGAACATCGATACCGATACCCAATGGGCAACCTGGGAAGGCGTTCTGAACTAC 
ValLysMetAsnIleAspThrAspThrGlnTrpAlaThrTrpGluGlyValLeuAsnTyr



TACAAAGCGAACGAAGCTTATCTGCAGGGTCAGCTGGGTAACCCGAAAGGCGAAGATCAG 
TyrLysAlaAsnGluAlaTyrLeuGlnGlyGlnLeuGlyAsnProLysGlyGluAspGln 



CCGAACAAGAAATACTACGATCCGCGCGTATGGCTGCGTGCCGGTCAGACTTCGATGATC 
ProAsnLysLysTyrTyrAspProArgValTrpLeuArgAlaGlyGlnThrSerMetIle



GCTCGTCTGGAGAAAGCATTCCAGGAACTGAACGCGATCGACGTTCTGTAAGATATTCCT
AlaArgLeuGluLysAlaPheGlnGluLeuAsnAlaIleAspValLeuEnd
TTCTGCTTATCTCAAGGCCCGCTCTGCGGGTCTTTTTTTCG 



FIG. 1B
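
The pairing of each nucleotide row with its three-letter amino acid row in FIG. 1 can be checked mechanically. The following is a minimal sketch, not part of the original disclosure, that assumes the standard genetic code and uses a hypothetical helper named `translate`; it renders stop codons as `End`, following the figure's convention.

```python
from itertools import product

# Codons are enumerated in T, C, A, G order at each position (third base fastest).
BASES = "TCAG"
# Standard genetic code in that codon order; '*' marks a stop codon.
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# One-letter to three-letter names; 'End' mirrors the label used in the figure.
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}

CODON_TABLE = {
    "".join(codon): THREE[aa]
    for codon, aa in zip(product(BASES, repeat=3), AMINO_1)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into concatenated three-letter codes.

    Hypothetical helper for illustration; trailing bases that do not
    complete a codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    # First three codons of the coding region shown in FIG. 1A.
    print(translate("ATGTCTAAG"))  # MetSerLys
```

Applied to the in-frame portion of any row of the figure, the helper should reproduce the amino acid row printed beneath it; a mismatch points to a transcription error in one of the two rows.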